DINUCLEOTIDE RESTRICTION ENDONUCLEASE PREPARATIONS AND METHODS OF
USE

FIELD OF THE INVENTION 

The present invention relates generally to isolated purified 
polynucleotides which encode restriction enzymes and to methods of expressing 
the restriction enzymes from such polynucleotides. More particularly, this
invention relates to isolated purified polynucleotides which encode CviJ I and
related methods for the production of this enzyme.

Other aspects of the invention relate to methods for partially or
completely digesting DNA at a dinucleotide sequence. More particularly, this 
aspect of the invention relates to methods of generating quasi-random fragments 
of DNA, and methods of cloning, labeling, and sequencing DNA, as well as 
epitope mapping of proteins. The invention also relates to methods for generating 
sequence-specific oligonucleotides from DNA, without prior knowledge of the 
nucleic acid sequence of such DNA, and to methods for cloning and labeling 
DNA after restriction digestion by a two base recognition endonuclease reagent. 
This invention also relates to methods for cloning, labeling, and detecting nucleic 
acids using two base restriction endonuclease reagents, such as CviJ I, BsuR I, 
Aci I or CGase I. Further the invention relates to labeling DNA by taking 
advantage of certain properties of the holo-enzyme of thermostable DNA 
polymerases. 

BACKGROUND OF THE INVENTION 

Restriction endonucleases are a group of enzymes originally found 
to be expressed in a wide variety of prokaryotic organisms. More recently they 
have also been found to be encoded in viral genomes. These enzymes catalyze 
the selective cleavage of DNA at generally short sequences, often unique to the 
individual enzyme. This ability to cleave makes restriction endonucleases 
indispensable tools in recombinant DNA technology. The increased commercial



WO 94/21663 



PCT/US94/03246 




availability of the isolated enzymes has contributed in large part to the enormous 
expansion in the field of recombinant DNA technology over the last few years. 

These enzymes have been classified into three groups. Because of 
properties of the type I and type III enzymes, they have not been widely used in
molecular biology applications, and will not be discussed further. Type II
enzymes are part of a binary system known as a restriction modification system 
consisting of a restriction endonuclease that cleaves a specific sequence of 
nucleotides and a separate DNA modifying enzyme that modifies the same 
recognition sequence and thereby prevents cleavage by the cognate endonuclease. 
A total of about 2103 restriction enzymes are known, encompassing 179 different 
type II specificities (Roberts, et al., Nucl. Acids Res. 20:2167-2180 (1992)).
Although there are more than 1200 type II restriction enzymes, many of them are 
members of groups which recognize the same sequence. Restriction enzymes that 
recognize the same sequence are said to be isoschizomers. 

The vast majority of type II restriction enzymes recognize specific
double-stranded sequences which are four, five, or six nucleotides in length and 
which display twofold (palindromic) symmetry. A few enzymes recognize longer 
sequences or degenerate sequences. 

The location of cleavage sites within a palindrome differs from 
enzyme to enzyme. Some enzymes cleave both strands exactly at the axis of 
symmetry generating fragments of DNA that carry blunt ends, while others cleave 
each strand at similar sequences on opposite sides of the axis of symmetry, 
creating fragments of DNA that carry protruding, single-stranded termini. 

Restriction endonucleases with shorter recognition sequences cut
DNA more frequently than those with longer recognition sequences. For
example, assuming a 50% G-C content, a restriction endonuclease with a 4-base
recognition sequence will cleave, on average, every 4^4 (256) bases compared to
every 4^6 (4096) bases for a restriction endonuclease with a 6-base recognition
sequence. Under certain conditions some restriction endonucleases are capable
of cleaving sequences which are similar but not identical to their defined
recognition sequence. This altered specificity has been termed "star" (*) activity 
and is observed only under certain non-standard reaction conditions. The manner 
in which an enzyme's specificity is altered depends on the particular enzyme and 
on the conditions employed to induce the star activity. Conditions that contribute 
to star activity include high glycerol concentration, high ratio of enzyme to DNA, 
low ionic strength, high pH, the presence of organic solvents, and the substitution 
of Mg++ with other divalent cations. The most common types of star activity
involve cutting at a recognition sequence having a single base substitution, cutting 
at sites having truncation of the outer bases of the recognition sequence, and 
single-strand nicking. The following restriction endonucleases show star activity: 
Ase I, BamH I, BssH II, BsuR I, CviJ I, EcoR I, EcoR V, Hind III, Hinf I, Kpn
I, Pst I, Pvu II, Sal I, Sca I, Taq I, and Xmn I. Star activity is generally viewed
as undesirable, and of little intrinsic value. 
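
By way of illustration only, the average cleavage spacings quoted above (256 bases for a 4-base site, 4096 bases for a 6-base site) can be reproduced with a few lines of Python. This is a minimal sketch that assumes equiprobable, independent bases; the function name expected_site_spacing is hypothetical and does not appear elsewhere in this document.

```python
def expected_site_spacing(site_length: int, alphabet_size: int = 4) -> int:
    """Average number of bases between occurrences of one fixed recognition
    sequence, assuming each base is equally likely and independent
    (an idealization; real genomes deviate from it)."""
    if site_length < 1:
        raise ValueError("site_length must be at least 1")
    return alphabet_size ** site_length


if __name__ == "__main__":
    # 2-base, 4-base and 6-base recognition sequences
    for n in (2, 4, 6):
        print(f"{n}-base site: about one cut per {expected_site_spacing(n)} bases")
```

The same relation gives 16 bases for a dinucleotide site, which is why a two base recognition endonuclease cuts far more frequently than conventional 4-base or 6-base enzymes.
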
Of the 179 unique type II restriction endonucleases, 31 have a 4-base
recognition sequence, 11 have a 5-base recognition sequence, 127 have a 6-base
recognition sequence, and 10 have recognition sequences of greater
than 6 bases. In two cases, a restriction endonuclease has a recognition sequence
of less than 4 bases.

The restriction enzyme CviJ I has a three base recognition sequence 
or a two-base recognition sequence, depending on the reaction conditions. Under 
normal reaction conditions CviJ I recognizes the sequence PuGCPy (wherein 
Pu=purine and Py=pyrimidine) and cleaves between the G and C to leave blunt 
ends (Xia et al., 1987. Nucleic Acids Res. 15:6075-6090). Under "relaxed" or 
"star" conditions (in the presence of 1 mM ATP and 20 mM DTT) the specificity 
of CviJ I may be altered to cleave DNA more frequently. This activity is referred 
to as CviJ I*, for star or altered specificity. However, CviJ I* activity is not 
observed under conditions which favor star activity of other restriction 
endonucleases. 
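
As an illustrative sketch only, the normal CviJ I specificity described above (PuGCPy, cleaved between the G and the C to leave blunt ends) can be modelled on a single strand with a regular expression. The function names cvij1_normal_cuts and blunt_fragments are hypothetical, and the sketch ignores methylation and all reaction-condition effects.

```python
import re

# PuGCPy: purine (A or G), G, C, pyrimidine (C or T).
# A lookahead is used so that overlapping sites are all reported.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")


def cvij1_normal_cuts(seq: str) -> list[int]:
    """Return cut positions i, meaning the strand is cut between seq[i-1]
    and seq[i], i.e. between the G and the C of each PuGCPy site."""
    seq = seq.upper()
    # The match starts at the purine; the G/C boundary is two bases later.
    return [m.start() + 2 for m in NORMAL_SITE.finditer(seq)]


def blunt_fragments(seq: str, cuts: list[int]) -> list[str]:
    """Split seq at the given cut positions."""
    bounds = [0, *cuts, len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    dna = "TTAGCTAAGGCCTT"
    cuts = cvij1_normal_cuts(dna)
    print(cuts, blunt_fragments(dna, cuts))
```

A site such as TGCA (PyGCPu) is not matched by this pattern, consistent with the normal specificity stated in the text.
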
The restriction enzyme BsuR I normally recognizes the sequence
GGCC and cleaves between the G and C to leave blunt ends (Heininger, et al.,
Gene 1:291-303 (1977)). Under relaxed conditions (high pH, low ionic strength,
and high glycerol concentration) the specificity of BsuR I may be altered to cleave
DNA more frequently. An isoschizomer of this enzyme, Hae III, does not display
this star activity.

In bacteria, the restriction endonuclease provides a mechanism of 
defense against foreign DNA molecules (e.g., bacteriophage DNA) by virtue of 
its ability to distinguish and cleave only exogenous DNA, leaving endogenous 
bacterial DNA unaffected. Viral endonucleases possess the same discerning 
capabilities, but rather than providing a means for defense, this activity has 
presumably evolved to cripple the host's ability to replicate its own DNA and 
allows the virus to assume control of the host's replication machinery. 

Bacteria and viruses which express restriction endonucleases 
necessarily possess the inherent ability to protect their own genome from cleavage 
by their endogenous endonuclease. The primary mechanism by which this is 
accomplished is by modifying the organism's own DNA by, for example,
methylating a base in the recognition sequence, which prevents binding and
cleavage by the endonuclease. Therefore, to ensure viability, the genome of an
organism which expresses a restriction endonuclease is almost always heavily 
modified, usually by methylation of cytosine or adenosine bases. The methylase 
enzyme which modifies the genome (itself a useful tool in molecular biology) acts 
in tandem with the endonuclease, either as part of an enzyme complex 
(restriction/modification complex) or as two distinct entities. Therefore, 
recognizing that an organism expresses an enzyme with endonuclease activity 
strongly suggests the expression of an associated modifying methylase enzyme 
(and vice versa) and this association has led to isolation and cloning of a number 
of commercially available restriction/modification enzymes for use in the 
laboratory as discussed below. 
One of the limitations in the use of restriction endonucleases exists
when cleavage of a given sequence is required and no known endonuclease exists 
which is specific for that particular sequence. Therefore, the continued 
identification and isolation of unique restriction endonucleases and altered reaction 
conditions will allow for even more sophisticated manipulation of DNA in vitro. 

A number of publications and patents describe the cloning of DNAs 
encoding restriction endonucleases. Included among these publications is Kiss,
A., et al., Nucleic Acids Research 13:6403-6421 (1985), which describes the
cloned nucleotide sequence of the BsuR I restriction-modification system isolated
from Bacillus subtilis. This system is specific for the sequence 5'-GGCC-3' and
is defined by two gene products which are transcribed by different promoters.
The methylase component of the system shows homology to the methylase from
the BspR I and SPR restriction-modification systems.

Nwanko, D.O. and Wilson, G.G., Gene 64:1-8 (1988), describe the
cloning and expression of the Msp I restriction and modification genes isolated
from Moraxella sp. This system recognizes the sequence 5'-CCGG-3' and both
enzymes are functional in E. coli. Evidence indicates that these genes are
transcribed in opposite directions and thus are probably under the control of different
promoters.

Ashok, K.D., et al., Nucleic Acids Research 20:1579-1585 (1992),
describe the purification and characterization of cloned Msp I methyltransferase,
over-expressed in E. coli. At low concentrations the enzyme exists as a 
monomer, but at higher concentrations it exists mainly as a dimer. Polyclonal 
antibodies to the enzyme cross-react with methyltransferase genes of other 
modification systems. 

Brooks, J.E., et al., Nucleic Acids Research 19:841-850 (1991),
characterize the cloned BamH I restriction modification system from Bacillus
subtilis. The two genes are divergently oriented and separated by an open reading 
frame which may serve as a transcriptional regulator in the native bacteria. 
Slatko, B.E., et al., Nucleic Acids Research 15:9781-9796 (1987),
describe the cloning, sequencing and expression of the Taq I restriction-modification
system. These genes have the same transcriptional orientation, with
the methylase gene 5' to the endonuclease gene. E. coli clones which carry only
the endonuclease gene are viable even in the absence of the methylase gene. This
is an unusual case, possibly explained by the 65°C optimal temperature for Taq I
restriction and the 37°C optimal temperature for E. coli growth.

Howard, K.A., et al., Nucleic Acids Research 14:7939-7951
(1986), describe the cloning of the Dde I restriction modification system from
Desulfovibrio desulfuricans by a two step method wherein the methylase gene is
first cloned and transformed into E. coli, followed by the cloning of the
endonuclease gene and transformation of this second gene into the methylase-expressing
bacteria. In order to maintain cell viability, high levels of methylase
expression are required before the endonuclease gene can be introduced into the
bacteria.

Ito, H., et al., Nucleic Acids Research 18:3903-3911 (1990),
describe the cloning, nucleotide sequence and expression of the Hinc II restriction-modification
system. The DNA was isolated from H. influenzae Rc, with the two
genes positioned in the same transcriptional orientation.

Shields, S.L., et al., Virology 176:16-24 (1990), describe the
cloning and sequencing of the cytosine methyltransferase gene M.CviJ I from the
Chlorella virus IL-3A. The methylase recognizes the sequence (G/A)GC(T/C/G)
and shows amino acid sequence homology with 5-methylcytosine methylases 
isolated from bacteria. DNA encoding the methylase was obtained from the viral 
genome which was propagated in the green alga host Chlorella. 

Xia, Y., et al., Nucleic Acids Research 15:6075-6090 (1987)
discovered that IL-3A virus infection of a Chlorella-like green alga induces the
expression of the DNA restriction endonuclease CviJ I, which has novel sequence
specificity. This endonuclease recognizes the sequence PuGCPy (wherein Pu =
purine and Py = pyrimidine) but does not cut the sequence PuGmCPy, where mC
is 5-methylcytosine.

U.S. Patent 5,137,823, issued August 11, 1992, to Brooks, J.E.,
describes a two step method for cloning the BamH I restriction modification
system wherein the methylase is cloned first and then introduced into a bacterial 
host. The endonuclease is then cloned and introduced into the methylase 
expressing bacteria. This two step procedure provides the host DNA protection 
from cleavage of the subsequently introduced endonuclease. 

U.S. Patent 5,200,333, ('333) issued April 6, 1993, to Wilson, 
G.G., describes a method for cloning restriction and modification genes. 
Specifically, this reference describes the cloning of the Taq I and Hae II systems
from Thermus aquaticus and Haemophilus aegyptius, respectively. In this
method, bacterial DNA was initially purified and digested, and the fragments 
were then cloned into a vector to produce a bacterial DNA library. The library 
was then transformed into E. coli and the cells were plated. Colonies were then 
scraped from the plate to form a primary cell library. Plasmid DNA from this 
cell library was purified and digested with the endonuclease of the two gene 
system. Bacteria which expressed the methylase gene had modified plasmid DNA 
which was protected from endonuclease activity, while plasmids from bacteria 
which lacked the intact methylase gene were digested. The resulting, undigested 
plasmid DNA was then transformed into another bacterial strain and the bacteria 
were plated. Surviving colonies were again harvested to give a secondary cell 
library and the entire procedure repeated. Plasmids which code for the complete 
restriction-modification system presumably survived each round of purification 
and were enriched. Bacteria which survive several rounds of enrichment were 
subsequently assayed for both methylase and endonuclease activity. 

U.S. Patent 5,196,331, ('331) issued March 23, 1993, to Wilson, 
G.G. and Nwanko, D., describes a method for cloning the Msp I restriction and
modification genes. This patent describes a method identical to that of U.S.
Patent 5,200,333 ('333). '331 is a continuation-in-part of, and '333 is a
continuation of, U.S.S.N. 707,079 (now abandoned).

As mentioned above, Chlorella virus IL-3A encodes a unique
restriction endonuclease called CviJ I (Xia et al., Nucleic Acids Res. 15:6075-6090
(1987)). IL-3A is a large, polyhedral, plaque-forming phycodnavirus (Francki,
R.I.B., et al., Arch. Virol. Suppl. 2, Springer-Verlag, Vienna (1991)) that replicates
in unicellular, eukaryotic green algae, Chlorella strain NC64A (Schuster, A.M.,
et al., Virology 150:170-177 (1986)). The double-stranded DNA genome of IL-3A
is approximately 330 kbp (Rohozinski et al., Virology 168:363-369 (1989)) and
contains 9.7% methylated cytidine (Van Etten, J.L. et al., Nucleic Acids Res.
13:3471-3478 (1985)). The cognate methyltransferase of CviJ I, M.CviJ I,
methylates (A/G)GC(T/C/G) sequences and has been cloned and sequenced
(Shields, S.L. et al., Virology 176:16-24 (1990)).

The use of a two/three base recognition endonuclease, such as
CviJ I, to improve numerous conventional molecular biology applications as well
as permitting novel applications has been described in co-pending U.S. Patent
Application Ser. No. 08/036,481, filed on March 24, 1993. The application
discloses methods for generating sequence-specific oligonucleotides from DNA 
without prior knowledge of the nucleic acid sequence of such DNA, and to 
methods for cloning and labeling DNA after restriction digestion by a two base 
recognition endonuclease. The application also teaches methods for generating 
quasi-random fragments of DNA, methods for cloning, labeling, and sequencing 
DNA, as well as epitope mapping of proteins. The ability to generate numerous 
oligonucleotides with perfect sequence specificity or quasi-random distributions 
of DNA fragments such as is possible with CviJ I* has important implications for
a number of conventional and novel molecular biology procedures. 

Infection of Chlorella species NC64A with the IL-3A virus
produces sufficient CviJ I restriction endonuclease (R.CviJ I) for research purposes.
However, production of commercially useful amounts of CviJ I is limited with this
system due to the slow growth of Chlorella algae, the large number of
contaminating nucleases associated with the virus, and the small yield of enzyme
obtained after purification. In addition, biochemical and biophysical
characterization of the enzyme, such as molecular weight determination, is
difficult from the native source. Because of these limitations it would be useful
to clone the gene for CviJ I in order to provide an adequate large scale source of
enzyme for use as a molecular biological reagent. 

SUMMARY OF THE INVENTION

In one of its aspects, the present invention provides purified and
isolated polynucleotides (e.g., DNA sequences and RNA transcripts thereof)
encoding a unique restriction endonuclease, CviJ I, as well as polypeptides and
variants thereof which display activities characteristic of CviJ I. Activities of CviJ I
include the recognition of specific DNA sequences, binding to these sequences
and cleaving the bound DNA into fragments. Preferred DNA sequences of the
invention include viral genomic sequences as well as wholly or partially
chemically synthesized DNA sequences. Replicas (i.e., copies of the isolated
DNA sequences made in vivo or in vitro) of DNA sequences of the invention are
also contemplated. A preferred DNA sequence is set forth in SEQ ID NO: 2
herein and is contained as an insert in the plasmid pCJH1.4. In another of its
aspects, the invention provides purified isolated DNA encoding a CviJ I
polypeptide by means of degenerate codons.

Also provided are autonomously replicating recombinant 
constructions such as plasmid DNA vectors incorporating CviJ I sequences and
especially vectors wherein DNA encoding CviJ I or a CviJ I variant is operatively
linked to an endogenous or exogenous expression control DNA sequence. 

According to another aspect of the invention, host cells such as 
prokaryotic and eukaryotic cells, are stably transformed with DNA sequences of 
the invention in a manner allowing the desired polypeptides to be expressed 
therein. Host cells expressing CviJ I and CviJ I variant products are useful in
methods for the large scale production of CviJ I and CviJ I variants wherein the
cells are grown in a suitable culture medium and the desired polypeptide products
are isolated from the host cells or from the medium in which the cells are grown.
A preferred host cell is E. coli. Still another aspect of the invention is a
recombinant CviJ I polypeptide.

The present invention is also directed to a method for the digestion 
of DNA with a restriction endonuclease reagent under conditions wherein said 
DNA is cleaved at a dinucleotide sequence selected from the group consisting of 
PyGCPy, PuGCPy, PuGCPu, and wherein Pu = purine and Py = pyrimidine. 
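
For illustration only, the three flanking classes listed above (PyGCPy, PuGCPy and PuGCPu) can be enumerated against all sixteen NGCN tetranucleotides. The sketch below is a simple classification exercise, not a model of enzyme kinetics; the names is_listed_site and CLEAVED_CLASSES are hypothetical.

```python
from itertools import product

PURINES = set("AG")

# Flank classes (5' flank, 3' flank) listed in the paragraph above.
CLEAVED_CLASSES = {("Py", "Py"), ("Pu", "Py"), ("Pu", "Pu")}


def base_class(base: str) -> str:
    """Classify a single base (A, C, G or T) as purine or pyrimidine."""
    return "Pu" if base in PURINES else "Py"


def is_listed_site(tetramer: str) -> bool:
    """True if tetramer is NGCN with flanks in one of the listed classes."""
    t = tetramer.upper()
    if len(t) != 4 or t[1:3] != "GC" or any(b not in "ACGT" for b in t):
        return False
    return (base_class(t[0]), base_class(t[3])) in CLEAVED_CLASSES


if __name__ == "__main__":
    all_sites = [f"{a}GC{b}" for a, b in product("ACGT", repeat=2)]
    listed = [s for s in all_sites if is_listed_site(s)]
    unlisted = [s for s in all_sites if not is_listed_site(s)]
    print(f"{len(listed)} of {len(all_sites)} NGCN contexts are listed")
    print("not listed (PyGCPu):", unlisted)
```

Under this reading, the only GC contexts absent from the list are those of the form PyGCPu.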

The present invention is also directed to a method for restriction 
endonuclease digestion of DNA comprising the step of digesting DNA with a 
restriction endonuclease reagent under conditions wherein said DNA is digested 
at 11 of 16 possible dinucleotide sequences and wherein said dinucleotide 
sequences are selected from the group consisting of PuCGPu, PuCGPy, and 
PyCGPu, and wherein Pu = purine and Py = pyrimidine. 

The present invention is directed to shotgun cloning of DNA, 
epitope mapping, and labeling of DNA using the digestion methods of the present
invention. The present invention provides methods for quasi-random fragmenting 
of DNA using the digestion methods of the present invention under conditions 
wherein the DNA is only partially cleaved and the site preference of the 
restriction endonuclease reagent is greatly reduced. By quasi-random is meant an 
overlapping population of DNA fragments produced by digesting DNA using the 
methods of the present invention without apparent site-preference and which
appears as a smear upon electrophoresis in a 1-2 wt. % agarose gel. The present 
invention is also directed to the shotgun cloning and sequencing of quasi-random 
fragments of DNA produced by the methods of the present invention. Quasi- 
random fragments in the shotgun cloning method of the present invention are 
produced by partial digestion of DNA with a restriction endonuclease reagent 
according to the methods of the present invention. More particularly, quasi-random
fragments of DNA useful in the cloning method of the present invention
are produced by the partial digestion of the DNA to be cloned with CviJ I, BsuR
I or with a restriction endonuclease reagent termed CGase I comprising Taq I and
Hpa II. Quasi-random fragments having a length of between about 100 and about
10,000 nucleotides are preferred. More preferred are quasi-random fragments of 
about 500 to about 10,000 nucleotides in length. The present invention is also 
directed to the generation of quasi-random fragmentation of DNA using the 
method of the present invention for the purposes of epitope mapping and gene 
cloning. These quasi-random fragments are expressed either in vitro or in vivo 
and the smallest fragment containing the desired function is identified by 
screening assays well known in the art. 

The present invention is also directed to the production of 
anonymous primers from any DNA without prior knowledge of the nucleotide 
sequence. The present invention provides methods for anonymous primer cloning
and sequencing after complete digestion of DNA utilizing CviJ I, BsuR I or 
CGase I using the methods of the present invention. 

Additionally, the present invention is directed to methods of 
labeling and detecting DNA comprising the complete digestion of DNA using the 
methods of the present invention, followed by a heat denaturation step, to yield
sequence specific oligonucleotides. In particular, an aspect of the present 
invention involves labeling DNA with sequence specific oligonucleotides of about 
20 to about 200 bases in length (with an average size of between 20-60 bases) 
generated by CviJ I, BsuR I or CGase I digestion of the template DNA.
More particularly, the invention is directed to restriction generated
oligonucleotide labeling (RGOL) of DNA which comprises the digestion of an 
aliquot of template DNA with CviJ I followed by a simple heat denaturation step, 
thereby generating numerous sequence specific oligonucleotides, which can then 
be utilized for labeling nucleic acids by a number of methods, including primer 
extension type reactions with a DNA polymerase and various labels, isotopic
or non-isotopic (RGOL-PEL); 5' end labeling with polynucleotide kinase; and 3' end
labeling using terminal transferase and various labels, isotopic or non-isotopic.
Labeling at the 3' end, also referred to as tailing, adds numerous labels per 
oligonucleotide (1-200), depending on the labeling conditions. The addition of
10-500 oligonucleotides generated per template results in a significant signal
amplification not obtainable by conventional methods. 

The invention is also directed to thermal cycle labeling (TCL) 
which comprises the simultaneous labeling and amplification of probes utilizing 
CviJ I or CGase I restriction generated oligonucleotides as the starting material. 
In this method, natural DNA of unknown sequence is digested with CviJ I to 
generate numerous double-stranded fragments which are then heat denatured to 
yield oligonucleotides. These oligonucleotides are combined with the intact 
template and subjected to repeated cycles of denaturation, annealing, and 
extension in the presence of a thermostable DNA polymerase or functional
fragment thereof which maintains polymerase activity, deoxynucleotide
triphosphates and the appropriate buffer. Alpha-32P-dATP (or any of the other
three deoxynucleotide triphosphates), biotin-dUTP, fluorescein-dUTP, or
digoxigenin-dUTP is incorporated during the extension step for subsequent
detection purposes. Thermal cycle labeling efficiently labels DNA while
simultaneously amplifying large amounts of the labeled probe. In addition, TCL
probes exhibit a 10-fold improvement in detection sensitivity compared to
conventional probes. 

The present invention is also directed to TCL in which the 
thermostable DNA polymerase supplies endogenous primers for enzymatic
extension. This method is referred to as Universal Thermal Cycle Labeling
(UTCL). In this method natural DNA of unknown sequence is combined intact
with the holo-enzyme of a thermostable DNA polymerase, deoxyribonucleotide
triphosphates, and the appropriate buffer. The holo-enzyme and its associated
endogenous primers are then combined with intact template and subjected to
repeated cycles of denaturation, annealing, and extension. Alpha-32P-dATP, 32P-dTTP,
32P-dGTP, 32P-dCTP, biotin-dUTP, fluorescein-dUTP, or digoxigenin-dUTP
is also included in the extension step for subsequent detection purposes.
Isotopic labels useful in the practice of the present invention include but are not
limited to 32P, 33P, 35S, 14C and 3H. Non-isotopic labels useful in the present
invention include but are not limited to fluorescein, biotin, dinitrophenol and
digoxigenin.

The present invention is also directed to an improved method for 
purifying CviJ I from the alga Chlorella infected with the virus IL-3A.

In addition the present invention is directed to restriction 
endonuclease reagents which, under conditions which relax the sequence 
specificity of one or more restriction endonucleases, cleave DNA at the 
dinucleotide sequences AT or TA. 

The present invention is also directed to a restriction endonuclease 
reagent comprising, in combination, Taq I and Hpa II, which is capable of
digesting DNA at 11 of 16 possible dinucleotide sequences, said sequences 
selected from the group consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu, 
and wherein Pu = purine and Py = pyrimidine. 

The following examples are intended to be illustrative of the several 
aspects of the present invention and are not intended in any way to limit the scope 
of any aspect of the present invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a map of the plasmid p710 which contains DNA
sequences encoding the IL-3A viral methyltransferase M.CviJ I;

Figure 2 is the nucleotide sequence of 5497 bp of cloned IL-3A
viral DNA;

Figure 3 is a restriction map of the cloned IL-3A viral DNA,
including the identified open reading frames;

Figure 4 is the DNA sequence of the CviJ I gene with its flanking
regions. The predicted amino acid sequence is provided below the nucleotide
sequences;

Figure 5A depicts the theoretical frequency and distribution of
CviJ I* restriction generated oligomers of individual lengths; Figure 5B shows the
actual frequency and distribution of CviJ I* restriction generated oligomers of
various lengths;

Figure 6 is a flow chart depicting anonymous primer cloning;

Figure 7 is a photographic reproduction of a gel depicting CviJ I
restriction digests of pUC19;

Figure 8 is a photographic reproduction of a gel depicting
comparisons of sonicated versus CviJ I* partially digested DNAs;

Figure 9A is a photographic reproduction of an agarose gel
electrophoresis analysis of size-fractionated DNA by microcolumn
chromatography compared to fractionation by agarose gel electroelution;

Figure 9B-E illustrates additional trials of the same procedures
used in Figure 9A;

Figure 10A illustrates the size distribution of DNA fragments
produced by partial digestion of DNA by CviJ I and fractionated by microcolumn
chromatography;

Figure 10B-C illustrates the size distribution of DNA fragments
produced by partial digestion of DNA by CviJ I and fractionated by agarose gel
electrophoresis;

Figure 11 is a schematic depiction of the distribution of CviJ I sites
in pUC19; and

Figure 12 is a graph of the rate of sequence accumulation by
CviJ I shotgun cloning and sequencing.

DETAILED DESCRIPTION

The gene for the restriction endonuclease R.CviJ I was cloned into
E. coli so as to provide an adequate source of R.CviJ I for use as a molecular
biological reagent. Biologically active CviJ I has been purified from E. coli to
apparent homogeneity. The molecular weight of E. coli-derived R.CviJ I is 32.5
kD by SDS gel electrophoresis. N-terminal amino acid sequence analysis of this
protein and comparison to the nucleotide sequence of the gene revealed that the
translation of this enzyme is probably initiated with a GTG start codon, instead
of the usual ATG initiation codon. The structural gene is 834 nucleotides in
length coding for a protein of 278 amino acids (31.6 kD). A second peak of
R.CviJ I activity which elutes separately from the 32.5 kD form can be seen in the
initial stages of enzyme purification. Trace amounts of a larger molecular weight
form have not been observed to date. However, the R.CviJ I gene does possess
an in-frame upstream ATG codon which if translated would yield a predicted 41.4
kD protein. The structural gene for this potentially larger product is 1074
nucleotides in length coding for a putative protein of 358 amino acids.
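
The relationship between the gene lengths and protein sizes quoted above can be checked with simple arithmetic. The following Python sketch is illustrative only: it divides the stated nucleotide lengths by three to obtain the stated residue counts, and applies a commonly used rule of thumb of roughly 110 Da per amino acid residue, which is an assumption and not a value taken from this disclosure. The function names are hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not from the text.
AVERAGE_RESIDUE_MASS_DA = 110


def residues_from_nucleotides(orf_nt: int) -> int:
    """Number of amino acids encoded by a coding region of orf_nt nucleotides,
    counted the way the text does (length divided by three)."""
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("coding length must be a positive multiple of 3")
    return orf_nt // 3


def rough_mass_kd(residues: int) -> float:
    """Very rough protein mass estimate in kD from residue count alone."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000


if __name__ == "__main__":
    for label, nt in (("GTG-initiated form", 834), ("upstream ATG form", 1074)):
        aa = residues_from_nucleotides(nt)
        print(f"{label}: {nt} nt, {aa} residues, roughly {rough_mass_kd(aa):.1f} kD")
```

The rule of thumb lands a few percent below the 31.6 kD and 41.4 kD values given in the text, as would be expected since the predicted masses depend on the actual amino acid composition rather than on an average residue mass.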

The present invention is also directed to a method for the 
fragmentation and cloning of DNA using the restriction endonuclease CviJ I under 
conditions which allow the enzyme to cleave DNA at the dinucleotide sequence 
GC. In addition, the present invention is also directed to the cloning of quasi- 
random fragments of DNA digested using the fragmentation method of the present 
invention. 

As an alternative to the methods for constructing random clone 
libraries described above, methods were devised for the construction of such 
libraries which require fewer steps and reagents, which require smaller amounts 
of DNA, which have relatively high cloning efficiencies and which take less time
to complete. These methods relate to the recognition that a partial digest with a 
two or three base recognition endonuclease cleaves DNA frequently enough to be 
functionally random with respect to the rate at which sequence data may be 
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accumulated from a shotgun clone bank. The restriction enzyme CviJ I normally 
recognizes the sequence PuGCPy and cleaves between the G and C to leave blunt 
ends (Xia et al, Nucl. Acids Res. 15:6075-6090 (1987)). Under "relaxed" 
conditions (in the presence of 1 mM ATP and 20 MM DTT) the specificity of 
CviJ I can be altered to cleave DNA more frequently and perhaps as frequently 
as at every GC. This activity is referred to as CviJ I*. Because of the high 
frequency of the dinucleotide GC in all DNA (16 bp average fragment size for 
random DNA), quasi-random libraries may be constructed by partial digestion of 
DNA with CviJ I*. A DNA degradation method with low levels of sequence 
specificity produces a smear of the target DNA when analyzed by agarose gel 
electrophoresis. Digestion of the plasmid pUC19 under partial CviJ I* conditions 
does not result in a non-discrete smear; rather, a number of discrete bands are 
found superimposed upon a light background of smearing, suggesting that CviJ 
I has some site preference. Atypical reaction conditions according to the present 
invention eliminate this apparent site preference of CviJ I* to produce an activity 
(termed CviJ I**) which, in combination with a rapid gel filtration size exclusion 
step, streamlines a number of aspects involved in shotgun cloning. 
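
The 16 bp average fragment size quoted above for the dinucleotide GC can be reproduced by a short calculation. The sketch below is illustrative only. It assumes that the four bases are equally frequent and independent ("random DNA"), and the function and table names are hypothetical.

```python
from fractions import Fraction

# Number of bases each symbol used in the text can match.
# Pu (purine) = A or G; Py (pyrimidine) = C or T.
CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "Pu": 2, "Py": 2}


def expected_spacing(site):
    """Mean distance in bp between occurrences of `site` in random DNA.

    `site` is a list of symbols, e.g. ["Pu", "G", "C", "Py"]. Assumes
    the four bases are equally frequent and independent.
    """
    probability = Fraction(1)
    for symbol in site:
        probability *= Fraction(CHOICES[symbol], 4)
    return 1 / probability


print(expected_spacing(["G", "C"]))              # relaxed (CviJ I*) site
print(expected_spacing(["Pu", "G", "C", "Py"]))  # normal CviJ I site
```

On these assumptions the normal PuGCPy site recurs on average every 64 bp and the dinucleotide GC every 16 bp, a four-fold increase in cutting frequency under relaxed conditions.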

One aspect of the present invention involves the use of the 
two/three base recognition endonuclease CviJ I, in conjunction with a simple spin- 
column method to produce libraries equivalent in final form to those generated by 
the combination of sonication and agarose gel electroelution. However, the 
method of the present invention requires fewer steps, a shorter time period, and 
significantly less substrate (nanogram amounts) when compared to conventional 
procedures. Both small and large sequencing projects using the methods 
described herein are within the scope of the present invention. 

Current sequencing paradigms require the generation of a new 
template for each 350-500 nucleotides sequenced. On this basis, sequencing both 
strands of the human genome would require at least 12 million templates 500 
nucleotides long, assuming no overlap between templates. 
A random approach, such as shotgun sequencing, would require 30 
to 50 million templates, assuming the entire genome were randomly subcloned. 
As many as 250,000 libraries may be needed to generate the requisite templates 
from a subcloned and ordered array of this genome, depending on the type of 
vector utilized, and the degree of overlap between such clones. The ability to 
generate shotgun libraries in a semi-automated, microtiter plate format would 
greatly simplify such large scale projects. 
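
The template counts given above can be checked with a short calculation. The sketch below is illustrative only. The genome size of 3 x 10^9 bp is an assumption, not stated in the text, chosen because it reproduces the figure of 12 million templates; the function names are hypothetical.

```python
HUMAN_GENOME_BP = 3_000_000_000  # assumed haploid genome size, not from the text


def templates_needed(genome_bp, read_length, strands=2):
    """Minimum number of non-overlapping templates covering `strands` strands."""
    # Ceiling division so that a final partial read still counts as a template.
    return -(-genome_bp * strands // read_length)


def fold_coverage(n_templates, read_length, genome_bp):
    """Total sequenced bases divided by the genome size."""
    return n_templates * read_length / genome_bp


print(templates_needed(HUMAN_GENOME_BP, 500))
print(fold_coverage(30_000_000, 500, HUMAN_GENOME_BP))
print(fold_coverage(50_000_000, 500, HUMAN_GENOME_BP))
```

Under the same assumption, the 30 to 50 million templates quoted for a random approach correspond to roughly five- to eight-fold redundancy over the genome.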

The development of methods for cloning large DNA molecules in 
yeast artificial chromosomes (Burke et al, Science 236:806-812 (1987)), or in 
bacteriophage P1-derived vectors (Sternberg, Proc. Natl. Acad. Sci. USA 87:103- 
107 (1990)), simplifies the subdivision and analysis of very large genomes. 
However, the large size of the resulting subclones (100 - 1000 kbp) presents 
additional challenges for subsequent sequencing efforts. A report of the 
sequencing of a 134 kbp genome by random shotgun cloning directly into a 
bacteriophage M13 vector indicates that numerous intermediate stages of 
subcloning, mapping, and overlapping such clones may be eliminated (Davison, 
J. DNA Seq. and Mapping 1:389-394 (1992)). An order of magnitude reduction 
in the amount of DNA required for shotgun cloning would substantially simplify 
efforts to directly sequence 100,000 bp sized molecules and beyond. 

The ability to generate an overlapping population of randomly 
fragmented DNA molecules is considered essential for minimizing the closure of 
nucleotide sequence gaps by the shotgun cloning method. The use of a very 
frequent-cutting restriction enzyme, such as CviJ I, is an approach which has not 
been utilized. Reaction conditions according to the present invention result in the 
quasirandom restriction of pUC19 and lambda DNA, as judged by the degree of 
smearing observed. 
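
The relationship between cutting frequency and the extent of partial digestion can be illustrated with a deliberately simplified model. The sketch below is not the method of the invention. It assumes evenly spaced sites that are each cleaved independently with the same probability, which real digests only approximate, and the function name and the 1000 bp target are hypothetical.

```python
def cut_fraction(site_spacing_bp, target_mean_bp):
    """Fraction of sites to cleave so fragments average `target_mean_bp`.

    Idealized model: sites are evenly spaced and each is cut
    independently with the same probability, so the mean fragment
    length is site_spacing_bp / fraction.
    """
    if target_mean_bp < site_spacing_bp:
        raise ValueError("target cannot be shorter than the site spacing")
    return site_spacing_bp / target_mean_bp


# 16 bp spacing is the text's figure for GC in random DNA; the 1000 bp
# target is a hypothetical insert size chosen for illustration.
print(cut_fraction(16, 1000))
```

In this model only a small fraction of the available sites is cleaved in any one molecule, which is why a very frequent cutter can approximate random fragmentation.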

The randomness of this CviJ I** reaction was quantified by 
sequence analysis of 76 such partially-fragmented pUC19 subclones. The analysis 
showed that CviJ I** partial digestion (limiting enzyme and time) restricts DNA 
at PyGCPy, PuGCPu, and PuGCPy (but not PyGCPu), and is thus a hybrid 
reaction which combines the three base recognition specificity of CviJ I with the 
"two" base recognition specificity of CviJ I*. Interestingly, most of the "relaxed" 
cleavage observed under CviJ I** conditions occurred in those portions of the 
sequence which were deficient in "normal" restriction sites. CviJ I** treatment 
produces a relatively uniform size distribution of DNA fragments, permitting 
sequence information to be accumulated in a statistically random fashion. 
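
The site classes described above can be located in a sequence with a few lines of code. The sketch below is illustrative only. It assumes a linear sequence containing only A, C, G and T, scans one strand, and treats the CviJ I* entry as cleaving every GC, which the text presents only as a possibility. All names and the demonstration sequence are hypothetical.

```python
PURINES = set("AG")

# Site classes cleaved under each condition, as described in the text.
CLEAVED = {
    "CviJ I": {"PuGCPy"},
    "CviJ I**": {"PuGCPy", "PyGCPy", "PuGCPu"},
    "CviJ I*": {"PuGCPy", "PyGCPy", "PuGCPu", "PyGCPu"},  # "perhaps" every GC
}


def flank_class(left, right):
    """Name the NGCN class from the bases flanking a GC dinucleotide."""
    l = "Pu" if left in PURINES else "Py"
    r = "Pu" if right in PURINES else "Py"
    return f"{l}GC{r}"


def cut_positions(seq, condition):
    """Return cut positions for a linear sequence under `condition`.

    The cut falls between G and C, so a returned index i splits the
    sequence into seq[:i] and seq[i:]. GC dinucleotides at the very
    ends, which lack a flanking base, are skipped.
    """
    seq = seq.upper()
    allowed = CLEAVED[condition]
    cuts = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC" and flank_class(seq[i - 1], seq[i + 2]) in allowed:
            cuts.append(i + 1)
    return cuts


demo = "AGCTTGCAAGCGTGCC"  # hypothetical sequence with one site of each class
for condition in CLEAVED:
    print(condition, cut_positions(demo, condition))
```

In the demonstration sequence the PyGCPu site (TGCA) is left uncut under CviJ I** conditions, in line with the sequencing result reported above.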

Shotgun cloning with CviJ I** digested DNA is efficient partly 
because the resulting fragments are blunt ended. Other methods currently used 
to randomly-fragment DNA, including sonication, DNAse I treatment, and low 
pressure shearing, leave ragged ends which must be converted to blunt ends for 
efficient vector ligation. Other than a heat denaturation step to inactivate the 
endonuclease, no additional treatments are required for cloning CviJ I** restricted 
DNA. In addition, the preligation step required to equalize representation of the 
ends of a DNA molecule prior to sonication or DNAse I treatment is not 
necessary with CviJ I** fragmentation. CviJ I* cleaves its cognate recognition 
site very close to the ends of a linear molecule, as judged by the very small 
fragments resulting from complete digestion of pUC19 as depicted in Figure 2. 

The overall efficiency of shotgun cloning depends not only on the 
fragmentation process, but also upon the size fractionation procedure used to 
remove small DNA fragments. The efficiency of cloning agarose gel fractionated 
DNA was found to be unexpectedly variable. Numerous experiments produced 
an erratic distribution of sized material and the resulting cloned inserts were 
uniformly small (70% < 500 bp in one trial, 100% < 500 bp in another). The 
method of the present invention includes a simple and rapid micro-column 
fractionation method, which has resulted in three to thirteen times more 
transformants than agarose gel fractionation. More importantly, the size 
distribution of the cloned inserts from column-fractionated DNA was skewed 
toward larger fragments (88% > 500 bp). Micro-column fractionation also 
eliminates the chemical extraction steps required for agarose fractionated DNA. 
After the target DNA has been column-fractionated, no further treatments are 
required for cloning. Combining CviJ I** partial restriction with micro-column 
fractionation permits the construction of useful libraries from as little as 200 ng 
of substrate, an order of magnitude less starting material than recommended for 
sonication/end-repair and agarose gel fractionation procedures. 

The CviJ I reaction represents a unique alternative for controlling 
the partial digestion of DNA, a technique which is fundamental to the construction 
of genomic libraries (Maniatis et al., Cell 15:687-701 (1978)), and restriction site 
mapping of recombinant clones (Smith et al., Nucl. Acids Res. 3:2387-2398 
(1976)). Partial DNA digests are notably variable and are strongly dependent on 
the concentration and purity of the DNA, the amount of enzyme used, the 
incubation time, and the batch of enzyme. Partial digestions may also be variable 
with respect to the rate at which a particular recognition sequence is cleaved 
throughout the substrate. Optimal reaction conditions, such as those which render 
such partial digests independent of one or more of these variables, allow more 
precise control of the end product. Several controlling schemes may be 
employed, including: the addition of a constant amount of carrier DNA (Kohara 
et al., Cell 50:495-508 (1987)), the use of limiting amounts of Mg2+ (Albertson 
et al., Nucl. Acids Res. 17:808 (1989)), ultraviolet irradiation (Whitaker et al., 
Gene 41:129-134), and the combination of a restriction enzyme and a sequence 
complementary DNA methylase (Hoheisel et al., Nucl. Acids Res. 17:9571-9582 
(1989)). Utilizing three different batches of CviJ I, and three different DNA 
templates from five separate preparations, a uniform CviJ I** partial digestion 
pattern was obtained that was primarily time-dependent when a constant ratio of 
0.3 units of enzyme per µg of DNA was used. 

The rate at which a particular restriction site is cleaved at different 
locations in a substrate is variable for many endonucleases (Brooks et al., 
Methods in Enzymol. 152:113-129 (1987)). Reaction conditions for CviJ I may 
be optimized to substantially reduce the site preferences of this enzyme during 
partial digestion (see Figure 2, lanes 3 and 4). Normally, "star" reaction 
conditions result in cleavage at new sites. The use of star reaction conditions 
according to the present invention (dimethyl sulfoxide [DMSO] and lowered ionic 
strength) to affect the partial digestion activity of CviJ I** does not result in an 
altered restriction site cleavage as assayed by sequencing the products of 76 
digestion reactions. Instead, the relative rate of cleavage of individual sites 
appears to be more uniform under these conditions. A 3-5 fold increase in the 
rate of normal CviJ I restriction with the standard buffer and DMSO further 
substantiates this approach. All of these results indicate that, under the 
appropriate reaction conditions, CviJ I is useful for a number of other 
applications, such as high resolution restriction mapping and fingerprinting, 
diagnostic restriction of small PCR fragments, and construction of genomic DNA 
libraries. 

Another aspect of the present invention involves quasi-random 
fragmentation of DNA using the method of the present invention for epitope 
mapping and cloning intact genes. The same method as described above for 
shotgun cloning is utilized, except that an expression vector is used to generate 
functional proteins from the DNA. 

Another aspect of the present invention involves fragmenting DNA 
using the present invention to generate multiple oligonucleotides from any double- 
stranded DNA template. Restriction-generated oligonucleotides (RGO) are 
sequence specific oligonucleotides generated from any DNA according to the 
present invention. CviJ I* presumably cleaves the recognition sequence GC 
between the G and C to leave blunt ends (Xia et al, Nucl. Acids Res. 15:6075- 
6090, (1987)). Because of the high frequency of dinucleotide GC in all DNA 
(16 bp average fragment size for random DNA), a complete CviJ I* restriction 
results in numerous fragments which are about 20-200 bp in size. These 
restriction fragments are generated from an aliquot of the template itself and are 
heat-denatured to yield numerous single-stranded oligonucleotides which are of 
variable length but which are specific for the cognate template. Complete CviJ 
I* restriction of the small plasmid pUC19 (2689 bp) theoretically yields 314 
oligonucleotides after a heat-denaturation step. The ability to generate numerous 
oligonucleotides with perfect sequence specificity is an unusual result of the use 
of this class of enzyme according to the present invention. Such oligonucleotides 
are uniquely suited for purposes of labeling DNA, as described below. 
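
The theoretical yield of 314 oligonucleotides follows from counting cleavage sites in a circular template: each site gives one cut, n cuts in a circle give n duplex fragments, and heat denaturation releases two strands per fragment. The sketch below is illustrative only. The figure of 157 GC sites in pUC19 is inferred from the stated 314 oligonucleotides and is not given in the text; the function names and the toy sequence are hypothetical.

```python
def count_gc_sites_circular(seq):
    """Count GC dinucleotides in a non-empty circular sequence, including
    one that may span the junction between the last and first base."""
    seq = seq.upper()
    wrapped = seq + seq[0]  # append the first base so the junction is scanned
    return sum(1 for i in range(len(seq)) if wrapped[i:i + 2] == "GC")


def oligos_after_complete_digest(n_sites):
    """Single strands released by heat-denaturing a complete digest of a circle.

    Assumes n_sites >= 1 and that every site is cleaved.
    """
    fragments = n_sites      # n cuts in a circle give n fragments
    return 2 * fragments     # each duplex fragment yields two strands


toy_plasmid = "CATGCAAGCTTG"  # hypothetical 12 bp circle, not a real plasmid
sites = count_gc_sites_circular(toy_plasmid)
print(sites, oligos_after_complete_digest(sites))
print(oligos_after_complete_digest(157))  # inferred site count for pUC19
```

The toy circle contains three GC sites, one of which spans the junction, and would yield six oligonucleotides on complete digestion and denaturation.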

One application of CviJ I* restriction-generated oligonucleotides is 
to directly label them using conventional methods. There are several important 
advantages in using CviJ I* restriction-generated oligonucleotides. Conventional 
methods employing synthetic oligonucleotides for detection purposes generally use 
one oligonucleotide containing one or a few labels. A complete CviJ I* digest 
generates hundreds of oligonucleotides from a given template, depending on the 
size of the template, and thus makes hundreds of sites available for labeling, 
regardless of the labeling scheme utilized. These hundreds of sequence specific 
restriction-generated oligonucleotides have two important advantages over 
conventional probes used in nucleic acid detection methods. First, the generation 
of multiple oligonucleotide probes directed at multiple sites in a given target 
(theoretically, 314 sites in pUC19) provides enhanced detection sensitivities 
compared to synthetic oligonucleotides which are directed at 1 or a few sites in 
a target. The numerous labeled restriction-generated oligonucleotides represent 
a 10-100 fold amplification of the signal for detection compared to the use of a 
single oligonucleotide. Second, the short length of the restriction-generated 
oligonucleotides permits more efficient hybridization. This is important for two 
reasons. First, hybridization time using restriction-generated oligonucleotides is 
reduced to 1 hr as opposed to an overnight incubation with conventional probes 
hundreds of nucleotides in length. This is a very important advantage when using 
oligonucleotide probes in clinical settings. Second, the penetration of probes into 
permeabilized cells is a critical issue for in situ hybridization procedures. The 
smaller the probe, the easier the entry into the cell. Thus, the use of multiple 
oligonucleotide probes generated by the two base cutters greatly improves the 
sensitivity of in situ hybridization, a technique of considerable importance in 
research and clinical labs. Finally, when using membrane-based hybridization 
procedures, only small sections of a target nucleic acid are exposed and available 
for hybridization. Multiple oligonucleotides derived from a cognate template 
exhibit better detection sensitivities compared to long probes. 

Another application of restriction-generated oligonucleotides for 
labeling is to employ them as primers in a polymerase extension labeling reaction 
in conjunction with a repetitive thermal cycling regimen of denaturation, 
annealing, and extension. Thermal Cycle Labeling (TCL) is a method for 
efficiently labeling double-stranded DNA while simultaneously amplifying large 
amounts of the labeled probe. The TCL system employs the two base recognition 
endonuclease CviJ I* to generate sequence-specific oligonucleotides from the 
template DNA itself. These oligonucleotides are combined with the intact 
template and subjected to repeated cycles of denaturation, annealing, and 
extension by a thermostable DNA polymerase from, for example, Thermus flavus. 
A radioactive- or non-isotopically-labeled deoxynucleotide triphosphate is 
incorporated during the extension step for subsequent detection purposes. The 
amplified, labeled probes represent a very heterogeneous mixture of fragments, 
which appears as a large molecular weight smear when analyzed by agarose gel 
electrophoresis. Primer-primer amplification, a side product of this reaction 
(produced by leaving out the intact template in the TCL reaction), may result in 
enhanced detection sensitivity, perhaps by forming branched structures. Biotin- 
labeled probes generated by the TCL protocol detect as little as 25 zeptomoles 
(2.5 x 10^-20 moles) of a target sequence. A 50 µl TCL reaction yields as much 
as 25 µg of labeled DNA, enough to probe 25 to 50 Southern blots. After 20 
cycles of denaturation and extension, biotin-dUTP-incorporated TCL probes may 
be routinely detected at a 1:10^6 dilution, which is 1000 fold more sensitive than 
RPL, and indicates that a significant degree of net synthesis or amplification of 
the probe is occurring. In addition, non-isotopically-labeled TCL probes exhibit 
a 10-fold improvement in detection sensitivity when compared to RPL-generated 
probes. 32P-labeled probes generated by the TCL protocol may also detect as 
little as 50 zeptomoles (2.5 x 10^-20 moles) of a target sequence. As little as 10 
pg of template DNA is enough to synthesize 5-10 ng of probe. The radioactive 
version of TCL generates probes having extremely high specific activities (e.g., 
about 5 x 10^9 cpm/µg DNA), which permits 5 to 10-fold lower detection limits 
than conventional labeling protocols. 
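
The detection limits above are stated in zeptomoles, and it can be helpful to express them as moles and as numbers of molecules. The sketch below is a unit conversion for illustration only; the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def zeptomoles_to_moles(zmol):
    """Convert zeptomoles (10^-21 mol) to moles."""
    return zmol * 1e-21


def molecules(zmol):
    """Approximate number of target molecules in `zmol` zeptomoles."""
    return zeptomoles_to_moles(zmol) * AVOGADRO


print(zeptomoles_to_moles(25))
print(round(molecules(25)))
```

A detection limit of 25 zeptomoles therefore corresponds to roughly fifteen thousand copies of the target sequence.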

There are several advantages to using restriction-generated 
oligonucleotides for primer extension labeling of DNA. One advantage is the 
specificity of the primers. All of the oligonucleotides generated by the TCL 
system are specific for the template utilized, unlike random primer labeling (RPL) 
which utilizes synthetic oligonucleotides 6-9 bases in length having a random 
sequence. The amount of primer required for efficient labeling with the TCL 
system is only 10 ng, compared to the 10 µg of random primers utilized for RPL. 
Due to their short length, random primers anneal very inefficiently above 25- 
37°C, thus RPL is limited to DNA polymerases such as Klenow or T7. The 
restriction-generated oligonucleotides are longer than the random primers, 
which extends the hybridization and extension conditions to include a wide variety 
of temperatures and polymerases. Thus, the use of the restriction-generated 
sequence-specific oligonucleotides results in more efficient hybridization and 
extension as compared to RPL. The TCL system has been optimized for labeling 
with a thermostable DNA polymerase which allows the option of temperature 
cycling. After 20 cycles of denaturation and extension, a significant amount of 
amplified TCL probes can be generated. Most importantly, TCL-labeled probes 
exhibit a 10 fold improvement in detection sensitivity when compared to RPL- 
generated probes. 

Another aspect of the present invention involves a variation of TCL 
called Universal Thermal Cycle Labelling (UTCL) in which the extension primers 
are not supplied by CviJI restriction, but rather, are found endogenously in the 
enzyme preparations of thermostable DNA polymerases. Random sequence DNA 
is usually co-purified along with the holo-enzyme preparation of the thermostable 
DNA polymerases, regardless of the source of the enzyme, i.e. native or cloned. 
However, only the holo-enzyme, and not the exonuclease minus deletion variants, 
contains the endogenous DNA. Typically, when the holo-enzymes of thermostable 
polymerases are used in protocols such as the polymerase chain reaction, the 
presence of such primers can create spurious results. Methods for circumventing 
the problems of endogenous DNA are described in PCR Protocols: A Guide to 
Methods and Applications, Eds. M. Innis, et al., Academic Press, 1990. 

This residual DNA is rather short (approximately 5-25 bases), as 
assayed by end-labeling with [γ-32P]ATP and polynucleotide kinase, and acts as 
endogenous "random" primers in a TCL-type reaction. In UTCL, the holo- 
enzyme of a thermostable polymerase from, for example, Thermus flavus, is 
combined with the intact DNA template and subjected to repeated cycles of 
denaturation, annealing, and extension. A radioactive- or non-isotopically-labeled 
deoxynucleotide triphosphate is incorporated during the extension step for 
subsequent detection purposes. The amplified, labeled probe represents a very 
heterogenous mixture of fragments, which appears as a large molecular weight 
smear when analyzed by agarose gel electrophoresis. Biotin-labeled probes 
generated by the UTCL protocol detect as little as 25 zeptomoles (2.5 x 10^-20 
moles) of a target sequence. A 15 µl UTCL reaction yields as much as 5-10 µg 
of labeled DNA, enough to probe 5 to 10 Southern blots. After 20 cycles of 
denaturation and extension, biotin-dUTP-incorporated UTCL probes may be 
routinely detected at a 1:10^6 dilution, which is 1000 fold more sensitive than 
RPL, and indicates that a significant degree of net synthesis or amplification of 
the probe is occurring. In addition, non-isotopically-labeled UTCL probes exhibit 
a 10-fold improvement in detection sensitivity when compared to RPL-generated 
probes. 32P-labeled probes generated by the UTCL protocol may also detect as 
little as 50 zeptomoles (2.5 x 10^-20 moles) of a target sequence. The radioactive 
version of UTCL generates probes having extremely high specific activities (e.g., 
about 5 x 10^9 cpm/µg DNA), which permits 5 to 10-fold lower detection limits 
than conventional labeling protocols. 

The present invention is illustrated by the following examples 
relating to the isolation of a full length viral DNA clone encoding R.CviJI, to the 
expression of R.CviJI DNA in E. coli strain DH5αF'MCR and to purification of 
R.CviJI from this bacterial strain. More particularly, Example 1 provides for the 
propagation of IL-3A virus and isolation of viral genomic DNA. Example 2 
addresses the improved expression of a clone for the viral methylase M.CviJI. 
Example 3 describes the strategy for isolating and cloning the viral R.CviJI gene 
by a forced co-cloning strategy of the M.CviJI gene. Example 4 describes the 
sequencing of cloned IL-3A genomic DNA and identification of the R.CviJI gene. 
Example 5 relates the methods for purification of CviJI to homogeneity from an 
E. coli strain, DH5αF'MCR, transformed with a plasmid which encodes the 
R.CviJI enzyme. Example 6 details the amino acid sequence analysis of the 
purified R.CviJI enzyme. Example 7 describes the analysis of CviJI* recognition 
sequences. Example 8 relates to a technique for producing restriction generated 
oligonucleotides using CviJI. Example 9 relates the generation of anonymous 
primers using CviJI. Example 10 describes end-labeling of CviJI restriction 
generated oligonucleotides. Example 11 describes primer extension labeling of 
DNA using restriction generated oligonucleotides. Example 12 relates the use of 
CviJI in thermal cycle labeling of DNA as well as the method of universal thermal 
cycle labelling. Example 13 provides a method for generation of quasi-random 
DNA fragments using CviJI. Example 14 describes fractionation of CviJI digested 
DNA by size using spin column chromatography. Example 15 details the relative 
cloning efficiency of CviJI digested, size-fractionated DNA by gel elution and 
chromatographic methods. Example 16 describes the comparison of cloning 
efficiency using lambda DNA fragmented by both sonication and CviJI** 
techniques. Example 17 details the use of CviJI** fragmentation for shotgun 
cloning and sequencing. Example 18 describes the shotgun cloning of lambda 
DNA using CviJI. Example 19 describes the use of CviJI in epitope mapping 
techniques. Example 20 describes the restriction endonuclease reagent CGase I. 

Example 1 
Propagation of IL-3A Virus 

The exsymbiotic Chlorella-like alga, NC64A, originally isolated 
from Paramecium bursaria (Karakashian, S.J. and Karakashian, M.W., Evolution 
and Symbiosis in the Genus Chlorella and Related Algae. Evolution 19:368-377 
(1965)), was grown and maintained in Bold's basal medium (BBM), (Nichols, 
H.W. and Bold, H.C. J. Phycol. 1:34-38 (1965)) modified by the addition of 
0.5% sucrose, 0.1% proteose peptone, and 20 µg/ml tetracycline (MBBM). 
Cultures were inoculated with 1 x 10^6 algae cells/ml and grown at 25°C in 250 
ml of MBBM in 500 ml Erlenmeyer flasks on a rotary shaker (150 rpm) in 
continuous light (ca. 30 µEi m^-2 sec^-1). Growth was monitored by light 
scattering measured as A640nm and/or by direct cell counts with a 
hemocytometer. 

When the cultures reached approximately 1 x 10^7 algae cells/ml 
they were inoculated with filter sterilized (0.4 µm nitrocellulose filter, 
Nucleopore, Pleasanton, California) IL-3A virus at a multiplicity of infection of 
0.01 and incubated for an additional 48 - 72 hours at 25°C. The crude lysate was 
then centrifuged at 3000 rpm (2000 xg) for 10 minutes to remove cellular debris. 
Nonidet P-40 was then added to 1% (v/v) and the virus was pelleted from the 
supernatant by centrifuging at 15,000 rpm at 4°C for 75 minutes in a Beckman 
No. 30 rotor. The viral pellet was gently resuspended in 0.05 M Tris-HCl, pH 
7.8, and the sample was layered on linear 10 - 40% sucrose gradients equilibrated 
with 0.05 M Tris-HCl, pH 7.8, and centrifuged for 20 minutes at 20,000 rpm at 
4°C in a Beckman SW28 rotor. The viral band, which was present in the center 
of the gradient as an opaque band, was removed, diluted with 0.05 M Tris-HCl, 
pH 7.8, and pelleted by centrifugation at 15,000 rpm at 4°C for 120 minutes in 
a Beckman No. 80 rotor. The virus was resuspended in a small volume (10 ml) 
of 0.05 M Tris-HCl, pH 7.8, and stored at 4°C. 
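
Only the first centrifugation step above is given both as a rotor speed and as a relative centrifugal force (3000 rpm, 2000 x g). The two are related by the standard formula RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimeters. The sketch below is illustrative only. It derives the rotor radius implied by the quoted pair of values; that radius is a computed figure, not one stated in the text, and the function names are hypothetical.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_for(rpm, target_rcf):
    """Rotor radius in cm that gives `target_rcf` at `rpm`."""
    return target_rcf / (RCF_CONSTANT * rpm ** 2)


implied_radius = radius_for(3000, 2000)  # from "3000 rpm (2000 x g)"
print(round(implied_radius, 1))
```

The same function can be used to convert the other speeds to x g once the radius of the rotor actually used is known.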

IL-3A viral DNA was purified from the viral particles using a 
modification of the protocol described by Miller, S.A., Dykes, D.D., and 
Polesky, H.I., Nucleic Acids Res. 16:1215 (1988). Briefly, 100 µl of IL-3A 
virus (9.8 x 10^11 plaque forming units/ml) was diluted with 400 µl of water and 
then mixed with 10 µl TEN (0.5 M Tris-HCl, pH 9.0, 20 mM EDTA, 10 mM 
NaCl) and 10 µl of 10% SDS. After incubating at 70°C for 30 minutes the 
solution was extracted twice with phenol-chloroform-isoamyl alcohol, extracted 
once with chloroform, and precipitated with ice-cold ethanol using methods well 
known in the art and resuspended in 500 µl of H2O (Ausubel, F.M., Brent, R., 
Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K. (Eds.) 
(1987) Current Protocols in Molecular Biology, Wiley, New York; Sambrook, J., 
Fritsch, E.F. and Maniatis, T. (1989), Molecular Cloning: A Laboratory Manual, 
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). 

Example 2 
CviJI Methyltransferase Clone 

The CviJI methyltransferase gene (M.CviJI) from Chlorella virus 
IL-3A was cloned and sequenced by Shields et al, Virology 176:16-24 (1990). 
Briefly, a Sau3A partial digest of Chlorella virus IL-3A was ligated to BamHI 
digested pUC19 and transformed into E. coli strain RR1. This library of plasmids 
was restricted with HindIII (AAGCTT) and SstI (GAGCTC), both of which are 
inhibited by 5-methylcytidine (5mC) in the AGCT portion of their recognition 
sequences, and transformed again into RR1 cells. M.CviJI methylates the internal 
cytidine in (G/A)GC(T/C/G) sequences. If the M.CviJI gene is cloned and 
expressed appropriately, the plasmid DNA would be expected to be resistant to 
HindIII and SstI restriction. 
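
The selection described above depends on the M.CviJI target, (G/A)GC(T/C/G), falling within the recognition sequence of the selecting enzyme. The sketch below is illustrative only. It checks the top strand of a recognition sequence for such a target and does not model whether methylation at that position actually blocks a given enzyme. The function names are hypothetical, and EcoRI is included only as a hypothetical negative control.

```python
import re

# M.CviJI target as given in the text: (G/A)GC(T/C/G).
# The lookahead allows overlapping targets to be reported.
METHYLATION_TARGET = re.compile(r"(?=([GA]GC[TCG]))")


def methylation_targets(site):
    """Return (offset, match) for every M.CviJI target within `site`."""
    return [(m.start(), m.group(1))
            for m in METHYLATION_TARGET.finditer(site.upper())]


def is_potentially_protected(site):
    """True if the recognition sequence contains at least one target."""
    return bool(methylation_targets(site))


for name, site in [("HindIII", "AAGCTT"), ("SstI", "GAGCTC"),
                   ("EcoRI", "GAATTC")]:
    print(name, site, methylation_targets(site))
```

Both selecting enzymes contain the target AGCT at offset 1 of their recognition sequences, consistent with the AGCT-based selection described above.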

The CwJI methyltransferase gene was originally cloned as a 7.2 kb 
insert, termed pIL-3A.22. Plasmid pIL-3A.22 was only partially resistant to CviJI 
digestion. Partial digestion is most likely due to the inefficient expression of the 
M.CviJI gene and the numerous CviJI sites in both the vector (pUC19 has 45 
CviJI sites) and in the insert DNA. The M.CviJI gene was eventually sublocalized 
to a region of 3.7 kb by subcloning using methods well known in the art 
(Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, 
J.A. and Struhl, K. (Eds.) (1987) Current Protocols in Molecular Biology, Wiley, 
New York; Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989), Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, New York) and testing the subcloned DNA for 
sensitivity/resistance to HindIII, SstI, and CviJI (Shields et al., supra). The 
entire sequence was determined and three open reading frames which could code 
for polypeptides 161, 367, and 162 amino acids, respectively, were identified. 
The 367 amino acid open reading frame (ORF) was identified as the M.CviJI gene 
by three criteria: (i) it is the only ORF located in the region identified by 
transposon mutagenesis; (ii) it has amino acid motifs similar to those of other 
cytosine methyltransferases; and (iii) a 1.6 kb DraI fragment containing the 367 
amino acid ORF (1101 bp) produces the methyltransferase. This 1.6 kb M.CviJI 
encoding fragment was subcloned into the EcoRV site of pBluescript KS(-) 
(Stratagene, La Jolla, CA), in the same translational orientation as the lacZ' gene 
of this vector. A physical map of the resulting plasmid termed p710 is shown in 
Figure 1. 

The plasmid p710 was digested with several endonucleases to 
indirectly test the efficiency of M.CviJI expression. Fully active methylase should 
render the plasmid DNA completely resistant to digestion by the following 
enzymes: HaeIII (which recognizes the sequence GGCC), SacI (which recognizes 
the sequence GAGCTC), and HindIII (which recognizes the sequence AAGCTT). 
The plasmid was partially resistant to HaeIII (90%) and SacI (90%), and even less 
resistant to HindIII (25%) digestion. This lack of complete protection of the 
plasmid DNA made it impractical to attempt cloning the three/two base restriction 
endonuclease encoded by the R.CviJI gene. Thus, improvements in the efficiency 
of M.CviJI expression were required before attempting to clone the R.CviJI gene. 

The translation efficiency of the M.CviJI gene was improved by 
removing extraneous 5' open reading frames, creating a perfect fusion of the 
lacZ' Shine-Dalgarno sequence with the methyltransferase start codon (see Figure 
1). This was achieved by site-specific oligonucleotide mutagenesis, using the 
oligomer 

5'-CAATTTCACACAGGAAACAGCTATGTCTTTTCGCACGTTAGAAC-3' 
(SEQ ID NO: 1) to precisely remove the intervening lacZ' DNA. The relevant 
DNA sequences are indicated in Figure 1 (SEQ ID NO: 12). The mutagenesis was 
facilitated by converting the double stranded plasmid DNA of p710 to single- 
stranded DNA by co-infecting the E. coli host strain with the helper phage R408 
(Russel, M., Kidd, S. and Kelly, M.R. Gene 45:333-338), using methods well 
known in the art. The mutagenesis reaction was completed using a commercially 
available kit according to the manufacturer's instruction (Mutagene, Bio-Rad, 
Hercules, California). The oligonucleotide was annealed to the single-stranded 
plasmid, extended in the presence of T4 DNA polymerase, ligated using T4 DNA 
ligase, and transformed into competent SURE™ cells (Stratagene, La Jolla, 
California). Transformed cells were then grown overnight as a pool, the DNA 
isolated and purified. 
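
The structure of the mutagenic oligomer (SEQ ID NO: 1) can be examined with a few string operations. The sketch below is illustrative only. The sequence is transcribed from the text with one unclear character, at position 22 counting from 1, read as T, which is an assumption. Taking AGGA as the core of the Shine-Dalgarno sequence is likewise an assumption, and the variable names are hypothetical.

```python
# SEQ ID NO: 1 as transcribed; the base before ATG is assumed to be T.
OLIGO = "CAATTTCACACAGGAAACAGCTATGTCTTTTCGCACGTTAGAAC"

start = OLIGO.find("ATG")            # methyltransferase start codon
sd = OLIGO.find("AGGA")              # assumed core of the Shine-Dalgarno sequence
spacer = start - (sd + len("AGGA"))  # bases between the SD core and the ATG
upstream_arm = OLIGO[:start]         # lacZ' leader side of the fusion
downstream_arm = OLIGO[start:]       # M.CviJI coding side of the fusion
gc_content = (OLIGO.count("G") + OLIGO.count("C")) / len(OLIGO)

print(len(OLIGO), start, spacer)
print(len(upstream_arm), len(downstream_arm), round(gc_content, 2))
```

On this reading the 44-mer has two 22-base arms flanking the fusion point, which places the start codon seven bases downstream of the AGGA core.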

Enrichment for the mutagenized plasmids was made possible by 
virtue of the loss of an XhoI site located in the sequence that was deleted by 
mutagenesis. Enrichment was accomplished by digesting the isolated, purified 
plasmid DNA with XhoI, followed by dephosphorylation with calf intestinal 
alkaline phosphatase (CIAP), and transformation into SURE cells. Plasmid DNA 
was isolated from 18 individual colonies and the DNA tested for resistance to 
XhoI. Plasmid DNA from 11 colonies was resistant to XhoI digestion, indicating 
that they lacked the deleted sequence. Five of these plasmids were restricted with 
HaeIII, HindIII, PvuII (which recognizes the sequence CAGCTG), and CviJI. All 
five appeared 100% resistant to these enzymes. Four of the plasmids were 
sequenced and the deletion was confirmed as being correct. One of these, 
pBMC5, was chosen for further modification. 

Example 3 
Forced Co-Cloning of R.CviJI 

The location of the R.CviJI gene on the IL-3A virus genome was 
inferred as being 3' to the M.CviJI gene for two reasons: 1) the cloned DNA 
sequence 5' to the M.CviJI gene did not produce a restriction activity; and 2) 
several attempts to clone the DNA 3' to the M.CviJI gene resulted in 
deletions/rearrangements of this downstream region. This information permitted 
a forced co-cloning strategy to obtain the restriction endonuclease gene. This 
strategy uses a deletion derivative of pBMC5 lacking the 3' half of the M.CviJI 
gene. Digestion of the IL-3A genome with the same enzyme used to create the 
M.CviJI deletion, followed by ligation of the respective DNAs, transformation, 
and digestion with enzymes incapable of recognizing methylated DNA (e.g., 
HaeIII, HindIII, PvuII, CviJI, etc.) should force the selection of clones which 
have a restored M.CviJI gene (and thus active methylase enzyme), as well as 
downstream DNA. Thus, if a clone is found to be CviJI resistant, the 3' half of 
M.CviJI must have been restored, and downstream DNA containing the R.CviJI 
gene, at least in part, would presumably be cloned. 

The details of this cloning strategy are as follows. pBMC5 has two 
EcoRI sites, one approximately in the middle of the M.CviJI gene, while the other 
site lies in the vector DNA, 3' to the M.CviJI gene (see Figure 1). pBMC5 was 
restricted with EcoRI and ligated at a dilute concentration (10-50 ng/µl) to favor 
circularization without the 3' M.CviJI fragment. The reaction mixture was then 
transformed into competent SURE cells and plated on TY agar containing 
ampicillin. Plasmid DNA from the resulting colonies was tested for the lack of 
this EcoRI fragment by digestion with EcoRI. One of these clones, pBMC5RI, 
was used for the subsequent co-cloning work. Plasmid pBMC5RI was digested 
with EcoRI and dephosphorylated using CIAP. IL-3A genomic DNA was then 
digested to completion with EcoRI. The EcoRI digested pBMC5RI and IL-3A 
DNAs were combined at a ratio of 1:3 in a ligation reaction using T4 DNA 
ligase, and the products of the ligation reaction were subsequently used to 
transform competent SURE cells. The pBMC5RI/IL-3A transformants were not 
plated, but rather grown overnight in culture as a library or pool of cells. The 
cells were harvested the next day and DNA was isolated and purified. Isolated, 
purified DNA was digested with HaeIII, dephosphorylated with CIAP, and 
transformed into competent SURE cells. The cells were then plated and grown 
overnight. Six colonies grew, of which only one, containing the plasmid 
pCJH1.4, was resistant to HaeIII. The plasmid pCJH1.4 was found to encode 
CviJI restriction activity. Plasmid pCJH1.4 was further characterized to localize 
the gene for CviJI by deletion analysis, subcloning experiments, and sequencing. 
The plasmid pCJH1.4 was deposited with the American Type Culture Collection 
on June 30, 1993 under Accession Number 69341. 

Example 4 

Sequencing of Cloned IL-3A DNA Containing CviJI Gene 

The EcoRI fragment cloned into pCJH1.4 (as described in Example 
3) is 4901 bp in length. Except for the 519 bp corresponding to the 3' portion 
of the M.CviJI gene, the remainder of the 4901 bp EcoRI fragment cloned into 
pCJH1.4 was sequenced using the SEQUAL DNA Sequencing System 
(CHIMERx, Madison, WI) by methods well known in the art. Sequencing was 
accomplished using three approaches: 1) primer walking on pCJH1.4; 2) cloning 
various restriction endonuclease digests of pCJH1.4 into an M13 type sequencing 
vector; and 3) sequencing various restriction endonuclease deletion derivatives of 
pCJH1.4. The nucleotide sequence of 5497 bp of IL-3A viral DNA is shown in 
Figure 2 and set forth in SEQ ID NO: 2. 

Six open reading frames (ORF) of 1155 bp (ORF1), 468 bp 
(ORF2), 555 bp (ORF3), 1086 bp (ORF4), 397 bp (ORF5) and 580 bp (ORF6) 
which could code for polypeptides containing 358 (41.4 kD), 156 (19.4 kD), 185 
(20.3 kD), 362 (38.9 kD), 132 (14.5 kD) and 193 (21.9 kD) amino acids, 
respectively, were identified (see Figure 3). ORFs 4-6 do not code for the 
R.CviJI gene, as the deletion derivative pCdA12, which lacks the DNA between 
the AvaI and BamHI sites (see Figure 3), does produce CviJI restriction 
endonuclease activity. In addition, the deletion derivative pCdEB7, lacking the 
DNA between the EcoRI and BamHI sites, did not produce CviJI activity. Thus 
ORF1 or ORF3 were the most likely candidates for encoding the R.CviJI gene. 
The sequence of the 1155 bp ORF1 (SEQ ID NO: 3), its deduced amino acid 
sequence (SEQ ID NO: 4) (as shown in capital letters), plus flanking bases, is 
presented in Figure 4. The vertical line in Figure 4 and the associated arrow 
indicate where the DNA sequence from pCJH1.4 diverges from that of 
pIL-3A.22-8 (Shields, S.L., et al., Virology 176:16-24, 1990). This open reading 
frame (ORF1) is believed to represent the CviJI gene because 14 out of 15 
N-terminal amino acids from the protein sequence (see Example 6) matched the 
predicted translation product of the nucleic acid sequence (Figure 4). Also, the 
32.5 kD molecular weight of the homogeneously purified enzyme described in 
Example 5 matched the predicted translation product of the nucleic acid sequence 
(31.6 kD) if the encoded protein was translated beginning at the GTG codon 
located at nucleotides 299-301 (Figure 4), instead of the 5' ATG codon located 
at nucleotides 59-61. This possibility is not surprising in light of the fact that 
approximately 10% of prokaryotic and eukaryotic gene products begin translation 
with a GTG start codon, rather than the usual ATG codon (Kozak, M., Microbiol. 
Rev. 47:1-45 (1983); Kozak, M., J. Cell. Biol. 108:229 (1989); Gold, L. et al., 
Annu. Rev. Microbiol. 35:365-403 (1981)). The structural gene was identified to 
be 834 nucleotides in length, coding for a protein of 278 amino acids (31.6 kD) 
and is set forth in SEQ ID NO: 4. It is also interesting to note that the CviJI gene 
was shown to possess an in-frame, upstream ATG codon which if translated could 
yield a protein with a predicted molecular weight of 41.4 kD (Figure 4). A larger 
molecular weight form possessing CviJI restriction activity has not been detected 
by SDS gel electrophoresis. However, a second peak of CviJI activity which 
eluted separately from the 32.5 kD form was detected in the initial stages of 
enzyme purification. The DNA sequence which could theoretically code for a 
larger form of CviJI would be approximately 1074 nucleotides in length (assuming 
it starts at the upstream ATG codon) and would code for a protein of 358 amino 
acids. 
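
The relationship between the two candidate start codons and the two predicted 
protein lengths can be checked with a few lines of arithmetic. The following is 
a minimal sketch only: the coordinates (ATG at nucleotides 59-61, GTG at 
nucleotides 299-301) and the lengths (834 and 1074 nucleotides) are taken from 
the text above, while the function name and the assumption that both lengths 
count sense codons only (no stop codon) are illustrative. 

```python
def codons_between(first_start: int, second_start: int) -> int:
    """Number of whole codons separating two in-frame start positions."""
    offset = second_start - first_start
    if offset % 3 != 0:
        raise ValueError("start positions are not in the same reading frame")
    return offset // 3


ATG_START = 59    # first base of the upstream ATG (Figure 4 numbering)
GTG_START = 299   # first base of the GTG start codon (Figure 4 numbering)

# Codons translated only when initiation occurs at the upstream ATG.
extra_codons = codons_between(ATG_START, GTG_START)

# Assumption: the stated gene lengths exclude the stop codon.
short_form_aa = 834 // 3    # protein initiated at GTG
long_form_aa = 1074 // 3    # protein initiated at the upstream ATG

print("codons between ATG and GTG:", extra_codons)
print("short form (aa):", short_form_aa)
print("long form (aa):", long_form_aa)
print("lengths consistent:", long_form_aa - extra_codons == short_form_aa)
```

Under these assumptions the two start codons are in the same reading frame, and 
the difference between the long and short forms equals the number of codons 
lying between the ATG and the GTG. 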



Example 5 

Purification of Recombinant CviJI Restriction Endonuclease 

Initially, 20 ml of LB medium (plus 100 µg/ml ampicillin) were 
inoculated with a 1 ml stock of E. coli transformed with the plasmid pCJH1.4 
described above and grown overnight at 37°C with shaking. The next day, 20 ml 
of this initial overnight culture was used to inoculate another 1 liter of LB 
medium and grown overnight. The following day, 50 liters of TB medium (12 
g Bacto-Tryptone, 24 g Bacto Yeast Extract, 4 ml glycerol, 2.31 g KH2PO4, 
12.54 g K2HPO4, 0.1 g MgSO4, 100 µg/ml ampicillin, and water to 1 liter) were 
inoculated with an aliquot of the secondary overnight culture and grown at 37°C 
with 20 liters/min aeration at 200 RPM, until the OD at 595 nm reached 1.0 unit. 
Vigorous aeration was essential for CviJI expression and a typical yield contained 
70 g of cell paste after centrifugation. 

The cell pellet was immediately resuspended in lysis buffer A 
(30 mM Tris-HCl, pH 7.9 at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol, 
50 µg/ml phenylmethylsulfonyl fluoride (PMSF), 20 µg/ml benzamidine, 2 µg/ml 
o-phenanthroline, 0.7 µg/ml pepstatin) at a volume of 3 ml of buffer A per 1 g of 
cells. The cell suspension was then passed through a Manton-Gaulin cell 
disrupter (Gaulin Corporation, Everett, MA) twice and centrifuged for 1 hr (8000 
RPM, Sorvall GS3 Rotor) at 4°C. To the supernatant, solid NaCl was added to 
a final concentration of 200 mM, and 10% polyethyleneimine (PEI) solution was 
slowly added to a final concentration of 1%. The mixture was stirred for 3 hr, 
and then centrifuged 30 min, at 4°C, 8000 RPM (Sorvall GS3 Rotor). Solid 
ammonium sulfate was then added to the supernatant at 0.5 g/ml and the mixture 
was stirred overnight at 4°C. The precipitated proteins were centrifuged for 1 hr 
(8000 RPM, Sorvall GS3 Rotor) at 4°C and the resulting pellet dissolved in 
100 ml of buffer B (10 mM K/PO4, pH 7.2, 0.5 mM EDTA, 10 mM 
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.05% Triton X-100, 50 µg/ml 
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin). 
The dissolved protein solution was then dialyzed (14 kD cut-off) for 12 hours 
against three 1 liter changes of buffer B. The dialyzed solution was then diluted 
to 600 ml with buffer B and applied to a 5 x 20 cm phosphocellulose P11 
(Whatman) column (flow rate 100 ml/hr). 

The column was then washed with 1.5 liter of buffer B followed 
by a 0 - 1.5 M NaCl gradient in buffer B (5 liters). R.CviJI eluted at 
approximately 600 mM NaCl. The active fractions were then pooled and 
concentrated to 50 ml with a 76 mm Amicon YM10 membrane. The resulting 
solution was then diluted to 300 ml with buffer C (20 mM Tris-acetate, pH 7.4 
at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol, 50 mM NaCl, 10% 
glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml 
o-phenanthroline, 0.7 µg/ml pepstatin) and applied to a 2.5 x 7 cm 
Heparin-Sepharose column at a flow rate of 25 ml/hr. 

After a 400 ml wash with buffer B, R.CviJI was eluted with a 
1.5 liter gradient of 0 - 1.3 M NaCl in buffer C. CviJI eluted at approximately 
400 mM NaCl. The most active fractions were pooled and applied to a 
2.5 x 7 cm Blue-agarose column equilibrated in buffer D (20 mM Tris-acetate pH 
8.0, 1 mM EDTA, 7 mM beta-mercaptoethanol, 30 mM NaCl, 10% glycerol, 
0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml 
o-phenanthroline, 0.7 µg/ml pepstatin). After a 500 ml wash with buffer D, CviJI 
was eluted with a 0 - 1.5 M NaCl gradient (1.5 liter) in buffer D. Active fractions 
were dialyzed against buffer G (10 mM K/PO4 pH 7.0 (4°C), 10 mM 
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.01% Triton X-100, 50 µg/ml 
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin) 
and loaded (20 ml/hr) onto a ceramic HTP column (American International 
Chemical, Natick MA) (1.5 x 3 cm), equilibrated in buffer F (20 mM Tris-HCl 
pH 8.0, 0.5 mM EDTA, 3 mM DTT, 50 mM K-acetate, 5 mM Mg acetate, 50% 
glycerol). After washing with 100 ml of buffer F, a 400 ml gradient 0 - 0.9 M 
K/PO4 in buffer F was run. The HTP column was washed with buffer G, 
containing 3 mg/ml BSA, then with 1 M phosphate buffer and reequilibrated in 
buffer G. The active fractions were then pooled and concentrated using a YM10 
membrane to a final volume of 3 - 4 ml. This concentrate was then applied to a 
2.5 x 95 cm Sephadex G-100 column, equilibrated in buffer E (20 mM Tris-HCl 
pH 7.5 (4°C), 5 mM Mg-acetate, 2 mM EDTA, 10 mM beta-mercaptoethanol, 
100 mM NaCl, 5% glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml 
benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin) at a flow rate of 
6 ml/hr, and 3 ml fractions collected. Active fractions were dialyzed against 
storage buffer F. 

The molecular weight of the purified CviJI was determined by 
comparison to known protein standards on a denaturing 10% SDS polyacrylamide 
gel, and a single band migrating with an apparent molecular weight of 32.5 
kilodaltons was seen, indicating that by these criteria CviJI was purified to 
homogeneity. 

Example 6 

N-Terminal Amino Acid Sequence of R.CviJI 

To confirm that the restriction endonuclease encoded by the insert 
in pCJH1.4 was CviJI, the sequence of the first 15 N-terminal amino acids of 
purified CviJI was determined by the Edman degradation method using an Applied 
Biosystems (Foster City, CA) 477A Liquid Phase Protein Sequencer with an 
on-line 120A PTH Analyzer. The results of that analysis are shown in Table 1. 

Table 1 

N-Terminal Amino Acid Analysis of CviJI 

Amino    Retention     pmol     pmol      pmol     pmol     Amino Acid ID 
Acid #   Time (min)    (Raw)    (-bkgd)   (+lag)   Ratio 

  1         9.17       6.11     3.86      5.10     34.53    THR, MET, ARG, or LYS 
  2        10.32       3.92     1.54      1.82      9.96    GLU 
  3        10.33       4.28     2.22      2.18     11.96    GLU 
  4        27.37       2.23     1.49      1.72      7.64    LYS 
  5        27.35       2.37     1.66      1.67      7.39    LYS 
  6        17.95       3.37     2.76      2.81      9.48    ARG 
  7        28.10       3.19     1.73      2.08      6.09    LEU 
  8        13.58       3.58     2.11      2.49     12.08    ALA 
  9        28.10       3.23     1.68      1.58      4.63    LEU 
 10        18.17       0.71     0.78      0.36      1.21    ILE 
 11        10.30       1.65     0.78      0.96      5.26    GLU 
 12         9.72       8.03     0.41      1.31      3.25    LYS 
 13         8.53       1.54     0.53      0.55      2.97    GLN 
 14        18.18       2.19     1.74      1.67      5.63    ARG 
 15        26.80       3.33     0.43                0.89    ILE 

Abbreviations used: threonine (THR), methionine (MET), arginine (ARG), lysine 
(LYS), glutamic acid (GLU), leucine (LEU), alanine (ALA), isoleucine (ILE) and 
glutamine (GLN). 



The results of this analysis confirm that the protein encoded by the 
DNA insert in pCJH1.4 (ORF1) is CviJI. 

The following Examples illustrate some of the unique properties of 
and important uses for CviJI. 



Example 7 
Analysis of CviJI* Recognition Sequences 

The CviJI* recognition sequence (see Xia, et al., Nuc. Acids Res. 
15: 6025-6090, 1987) was deduced by cloning and sequencing CviJI* digested 
pUC19 DNA fragments. A complete CviJI* digest of pUC19 was ligated to an 
M13mp18 cloning derivative for nucleotide sequence analysis. The sequence of 
the entire insert was read in order to determine which sites were or were not 
utilized. A total of 100 clones were sequenced, resulting in 200 CviJI* restricted 
junctions, the data for which are compiled in Table 2. 

The dinucleotide GC is found at 205 sites in pUC19. These GC 
sites (shown in Table 2) can be divided into four classes based on their flanking 
Pu/Py structure: the normal recognition sequence (N) and three potential classes 
of relaxed sites (R1, R2 and R3). As seen in Table 2, the fraction of such NGCN 
sites which belong to each classification is roughly equal (22.0%-27.8%). A total 
of 200 CviJI* restricted junctions were analyzed by sequencing 100 cloned inserts. 
If CviJI* cleaved at all NGCN sites without sequence preferences, it would be 
expected that each classification would be restricted approximately 
equally. Instead, most of the sites cleaved by this treatment were found to be 
normal, or PuGCPy, sites (47.5%). R1 (PyGCPy) and R2 (PuGCPu) restricted 
sites were found at nearly the same frequency (25.5% and 27.0%, respectively). 
Out of 200 CviJI* junctions, no R3 (PyGCPu) restricted sites were found. Thus, 
CviJI* cleaves all NGCN sites except for PyGCPu. As CviJI* cleaves 12 out of 
16 possible NGCN sites, it may be referred to as a 2.25-base recognition 
endonuclease. 

In addition to the restricted sites, those sites which were not cleaved 
under CviJI* conditions were also compiled for analysis, as shown in Table 2. A 
total of 116 non-cleaved NGCN sites were found in the 100 inserts which were 
sequenced. PyGCPu sites represented the largest class of non-cleaved sites 
(52.6%). In only two cases were PuGCPy sites found not to be cleaved. An 
approximately equal fraction of R1 and R2 sites were not cleaved as were found 
cleaved (22.4% versus 25.5% for R1 and 23.3% versus 27.0% for R2). Based 
on the frequency of cleavage, or lack thereof, a hierarchy of restriction under 
CviJI* conditions is evident, where PuGCPy >> PuGCPu = PyGCPy. 
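
The classification of GC sites by their flanking purine/pyrimidine structure 
can be illustrated with a short sketch. The class names (N, R1, R2, R3) follow 
the text above; the function name, the test sequence, and the treatment of the 
input as a linear molecule (GC sites at the extreme ends are skipped because 
they lack a flanking base) are assumptions made for illustration, not part of 
the analysis described in this Example. 

```python
from collections import Counter

PURINES = set("AG")
PYRIMIDINES = set("CT")


def classify_ngcn(seq: str) -> Counter:
    """Count GC dinucleotides by the Pu/Py class of their flanking bases.

    N  = PuGCPy (normal site)      R1 = PyGCPy
    R2 = PuGCPu                    R3 = PyGCPu (not cleaved by CviJI*)
    Sites with a missing or ambiguous flanking base are ignored.
    """
    seq = seq.upper()
    counts = Counter()
    # i indexes the G of each GC; i-1 and i+2 are the flanking bases.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] != "GC":
            continue
        left, right = seq[i - 1], seq[i + 2]
        if left in PURINES and right in PYRIMIDINES:
            counts["N"] += 1
        elif left in PYRIMIDINES and right in PYRIMIDINES:
            counts["R1"] += 1
        elif left in PURINES and right in PURINES:
            counts["R2"] += 1
        elif left in PYRIMIDINES and right in PURINES:
            counts["R3"] += 1
    return counts


# Hypothetical 16-base test sequence containing one site of each class.
example = "AGCTTGCCAGCGTGCA"
print(classify_ngcn(example))
```

Applied to a full plasmid sequence, the same tally would give the per-class 
fractions of NGCN sites; a circular template would additionally require 
wrapping the sequence ends, which this sketch does not do. 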
Example 8 

CviJI* Restriction Generated Oligonucleotides 

Due to the high frequency of CviJI or CviJI* restriction, it is 
possible to generate useful oligonucleotides by digestion and a heat denaturation 
step as described above. The size and number of the resulting oligonucleotides 
are important for subsequent applications such as those described above. If, for 
example, an oligonucleotide is to be used with a large genome, it has to be long 
enough so that the sequence detected has a probability of occurring only once in 
the genome. This minimum length has been calculated to be 17 nucleotides for 
the human genome (Thomas, C.A., Jr., Prog. Nucl. Acid Res. Mol. Biol. 5:315 
(1966)). Oligonucleotides used for sequencing or PCR amplification are generally 
17-24 bases in length. Oligomers of shorter length will often bind at multiple 
positions, even with small genomes, and thus will generate spurious extension 
products. Thus, an enzymatic method for generating oligomers should ideally 
result in polymers greater than 18 bases in length. 
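
The order of magnitude of this minimum length can be reproduced with a simple 
estimate. The sketch below is illustrative only and is not necessarily the 
calculation used in the cited reference: it assumes a random sequence with four 
equally frequent bases, a genome of roughly 3 x 10^9 bp (an assumed round 
figure), and that an oligonucleotide may pair with either strand. The function 
name is hypothetical. 

```python
def min_unique_length(genome_bp: int, strands: int = 2) -> int:
    """Smallest n for which the number of possible n-mers (4**n) exceeds
    the number of positions at which an n-mer could occur by chance."""
    positions = genome_bp * strands
    n = 1
    while 4 ** n <= positions:
        n += 1
    return n


# Assumed genome size: about 3 x 10^9 bp, both strands considered.
print(min_unique_length(3_000_000_000))
```

Under these assumptions the estimate agrees with the 17-nucleotide figure 
quoted above; real genomes contain repeats and biased base composition, so 
longer oligonucleotides are generally used in practice. 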

The theoretical number of pUC19 CviJI* restriction-generated 
oligomers is 314 (157 CviJI* restriction fragments x 2 oligomers/fragment), the 
size distribution of which is shown in panel A of Figure 5. Most of the expected 
CviJI* restriction-generated oligomers (about 75%) are smaller than 20 bp. This 
assumes that CviJI is capable of restricting DNA to very small fragments, the 
shortest of which would be 2 bp. However, in practice, about 93% of the cloned 
CviJI* fragments were 20-56 bp in size, and 3% of the fragments generated by 
CviJI* were smaller than 20 bp (panel B of Figure 5). This suggests that CviJI* 
is not able to bind or restrict those fragments below a certain threshold length. 
Since the smallest observed fragment is 18 bp, it may be assumed that this length 
is the minimal size which can be generated from a given larger fragment. 
Whatever the reason for this phenomenon, CviJI* treatment of DNA produces a 
relatively small range of oligomers (mostly 20-60 bases in length), most of which 
are a perfect size class for molecular biology applications. 
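
A theoretical fragment-size distribution of the kind referred to above can be 
approximated by an in-silico complete digest. The following sketch makes 
several assumptions that are not stated in this Example: the template is 
treated as linear, cleavage is placed between the G and the C of each site, 
every NGCN site other than PyGCPu is cut to completion, and no minimum 
fragment length is enforced. The function names and test sequence are 
hypothetical. 

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def cut_positions(seq: str) -> list[int]:
    """Indices p such that a cut falls between seq[p-1] and seq[p]."""
    seq = seq.upper()
    cuts = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] != "GC":
            continue
        # PyGCPu sites are the one class reported as not cleaved.
        if seq[i - 1] in PYRIMIDINES and seq[i + 2] in PURINES:
            continue
        cuts.append(i + 1)  # assumed cleavage between G and C
    return cuts


def fragment_lengths(seq: str) -> list[int]:
    """Lengths of the fragments of a linear molecule after complete digestion."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 16-base test sequence: three cleavable sites and one PyGCPu site.
example = "AGCTTGCCAGCGTGCA"
lengths = fragment_lengths(example)
print(lengths)
print(sum(1 for n in lengths if n < 20) / len(lengths))
```

Because the sketch cuts every permitted site, it models only the theoretical 
distribution; it does not reproduce the observed absence of fragments shorter 
than about 18 bp. 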

Example 9 
Anonymous Primer Cloning 

Primers are critical tools in many molecular biology applications 
such as PCR, sequencing, and as probes. Anonymous primers are useful as 
sequencing primers for genomic sequencing projects, as probes for mapping 
chromosomes, or to generate oligonucleotides for PCR amplification. 

The Anonymous Primer Cloning (APC) method is a variation of 
shotgun cloning in that unknown sequences of DNA are being randomly cloned. 
However, unlike CviJI shotgun cloning, wherein a partial CviJI** digest of DNA 
is cloned, anonymous primer cloning utilizes a complete CviJI* digest to restrict 
large DNAs into small fragments 20-200 bp in size. These small fragments are 
cloned into a unique vector designed for excising the anonymous DNA as labeled 
primers. The strategy for this method is illustrated in Figure 6. 

As illustrated in Figure 6, the APC strategy reduces large DNAs 
to small fragments, which are cloned and excised for use as primers. Plasmid 
pFEM has a unique arrangement of the restriction sites for MboII and FokI, which 
permits DNA cloned into the EcoRV site to be excised without associated vector 
DNA. This is possible because FokI cleaves 9/13 bases to the left of the 
recognition site shown in pFEM and MboII cleaves 8/7 bases to the right of the 
recognition site shown in pFEM, which is well into the cloned anonymous 
sequence. After MboII or FokI restriction, a known flanking primer is annealed 
(primer 1 or 2) and extended using a DNA polymerase and dNTPs. The primer 
is previously end-labeled, or alternatively, one or more of the dNTPs is 
radioactive. 

After denaturation of the newly synthesized DNA and separation 
from its cognate template, the labeled anonymous primer is ready for use in 
sequencing the original template from which it was subcloned. The presence of 
the pFEM vector sequence fused to the anonymous sequence does not influence 
the enzymatic extension of this primer from its unique binding site, as the vector 
DNA is at the 5' end and the unique sequence is located at the 3' end (all 
polymerases extend 5' to 3'). Both the top and bottom strand primers may be 
excised from pFEM due to the symmetrical placement of restriction sites and 
flanking primer binding sites. Thus, two primers may be derived from each 
cloning event. APC is particularly well suited to the genomic sequencing strategy 
of Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1991-1995 (1984), although 
its utility is not limited thereto. 

Example 10 

End Labeling of Restriction-Generated Oligonucleotides 

As is clear from the foregoing examples, digesting DNA with 
CviJI* provides the ability to generate sequence-specific oligonucleotides ranging 
in size from 20-200 bases in length with an average length of 20-60 bases. 
Sequence specific oligonucleotides generated by CviJI* digestion may be labeled 
directly at the 5'-end or at the 3'-end using techniques well known in the art. 

For example, 5'-end labeling may be accomplished by either a 
forward reaction or an exchange reaction using the enzyme T4 polynucleotide 
kinase. In the forward reaction, 32P from [γ-32P]ATP is added to a 5' end of an 
oligonucleotide which has been dephosphorylated with alkaline phosphatase using 
standard techniques widely known in the art and described in detail in Sambrook 
et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring 
Harbor Laboratory Press (1989). In an exchange reaction, an excess of ADP 
(adenosine diphosphate) is used to drive an exchange of a 5'-terminal phosphate 
from the sequence specific oligonucleotide to ADP, which is followed by the 
transfer of 32P from [γ-32P]ATP to the 5'-end of the oligonucleotide. This 
reaction is also catalyzed by T4 polynucleotide kinase and is described in 
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold 
Spring Harbor Laboratory Press (1989). 

Homopolymeric tailing is another standard labeling technique useful 
in the labeling of CviJI-generated sequence specific oligonucleotides. This 
reaction involves the addition of 32P-labeled nucleotides to the 3'-end of the 
sequence specific oligonucleotides using a terminal deoxynucleotidyl transferase 
(Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold 
Spring Harbor Laboratory Press (1989)). 

Commonly used labeling techniques typically employ a single 
oligonucleotide directed to a single site on the target DNA and containing one or 
a few labels. Oligonucleotides generated by the method of the present invention 
are directed to many sites of a target DNA by virtue of the fact that they are 
generated from a sample of the target sequence. Thus, the hybridization of 
multiple oligonucleotides (labeled by the methods described above) allows a 
significantly enhanced sensitivity in the detection of target sequences. In addition, 
the short length of the labeled oligonucleotides used in the methods of the present 
invention allows a reduction in hybridization time from overnight (as is used in 
conventional methods) to 60 min. 

Although labeling sequence specific oligonucleotides with 32P is 
described above, labeling with other radionucleotides and with non-radioactive 
labels is also within the scope of the present invention. 

Example 11 
Primer Extension Labeling of DNA Using 
Restriction-Generated Oligonucleotides (PEL-RGO) 

Another aspect of the present invention includes methods for 
labeling DNA which include the generation of oligonucleotide primers by 
complete digestion with CviJI*, followed by heat denaturation. PEL-RGO 
requires three steps: 1) generating the sequence-specific oligonucleotides by CviJI* 
restriction of the template DNA; 2) denaturation of the template and primer; and 
3) primer extension in the presence of labeled nucleotide triphosphates. Plasmid 
DNA may be prepared by methods known in the art such as the alkaline lysis or 
rapid boiling methods (Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, New York (1989)). In addition, the vector should be linearized to ensure 
effective denaturation. A restriction fragment may be labeled after separation on 
low melting point agarose gels by methods well known in the art. 

In PEL-RGO labeling, template DNA to be labeled is divided into 
two aliquots; one is used to generate the sequence specific oligonucleotide primers 
and the other aliquot is saved for the primer annealing and extension reaction. 
A typical reaction mix for generating sequence-specific oligonucleotides is 
assembled in a microcentrifuge tube and includes: 100 ng DNA; 2 µl 5X CviJI* 
buffer; 0.5 µl CviJI (1 u/µl); sterile distilled water to 10 µl final volume. CviJI* 
5X restriction buffer includes: 100 mM glycylglycine (Sigma, St. Louis, 
Missouri, Cat. No. G2265) pH adjusted to 8.5 with KOH, 50 mM magnesium 
acetate (Amresco, Solon, Ohio, Cat. No. P0013119), 35 mM β-mercaptoethanol 
(Mallinckrodt, Paris, Kentucky, Cat. No. 60-24-2), 5 mM ATP, 100 mM 
dithiothreitol (Sigma, St. Louis, Missouri, Cat. No. D9779) and 25% v/v DMSO 
(Mallinckrodt Cat. No. 67-68-5). CviJI is obtained from CHIMERx (Madison, 
Wisconsin). The reaction mix is incubated at 37°C for 30 min, followed by the 
inactivation of CviJI by heating at 65°C for 10 min. The CviJI*-restricted DNA 
may be used directly without further purification, or it may be stored at -20°C for 
several months for subsequent labeling reactions. 
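
The final (1X) concentrations of the buffer components in the 10 µl digestion 
follow from the dilution of 2 µl of 5X buffer. The sketch below simply applies 
that dilution to the 5X recipe listed above; the dictionary keys and the 
function name are illustrative, and the calculation assumes that the DNA and 
enzyme contribute none of these components. 

```python
# 5X CviJI* restriction buffer as listed in the text (units in the key names).
FIVE_X_BUFFER = {
    "glycylglycine (mM)": 100.0,
    "magnesium acetate (mM)": 50.0,
    "beta-mercaptoethanol (mM)": 35.0,
    "ATP (mM)": 5.0,
    "dithiothreitol (mM)": 100.0,
    "DMSO (% v/v)": 25.0,
}


def final_concentrations(stock: dict, stock_volume_ul: float,
                         final_volume_ul: float) -> dict:
    """Concentration of each stock component after dilution into the reaction."""
    return {
        name: conc * stock_volume_ul / final_volume_ul
        for name, conc in stock.items()
    }


# 2 µl of 5X buffer in a 10 µl reaction, as in the typical reaction mix.
one_x = final_concentrations(FIVE_X_BUFFER, 2.0, 10.0)
for name, conc in one_x.items():
    print(f"{name}: {conc:g}")
```

The same helper can be reused to scale the reaction, since only the ratio of 
buffer volume to final volume enters the calculation. 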

After heat-inactivating CviJI, 0.2 µg of the digested and undigested 
DNA are electrophoresed on a 1.5% agarose gel, using a suitable molecular 
weight marker for comparison. The CviJI restriction fragments appear as a low 
molecular weight smear in the 20-200 bp range. 

By way of example, 1-10 ng of linearized pUC19 was labeled under 
the conditions described below. A template-primer cocktail was prepared by 
mixing 10 ng of linearized pUC19 DNA template with 20 ng pUC19 
sequence-specific oligonucleotides (prepared as described above), and the mixture 
was brought to a final volume of 17 µl with sterile distilled water. The 
template-primer mixture was denatured in a boiling water bath for 2 minutes and 
immediately placed on ice. 

The following labeling mixture is then added to the template-primer 
mix: 2.5 µl 10X labeling buffer (500 mM Tris-HCl at pH 9.0, 30 mM MgCl2, 
200 mM (NH4)2SO4, 20 µM dATP, 20 µM dTTP, 20 µM dGTP, 0.4% NP-40); 
5.0 µl [α-32P]dCTP (3000 Ci/mmol, 10 µCi/µl, New England Nuclear, Catalog 
No. NEG013H); 0.5 µl Thermus flavus DNA polymerase (5 u/µl) (Molecular 
Biology Resources, Milwaukee, Wisconsin); up to 25 µl final volume with 
distilled water. The reaction was incubated at 70°C for 30 min and then stopped 
by adding 2 µl of 0.5 M EDTA at pH 8.0 to the reaction mix. 

The efficiency of the labeling reaction is gauged by the percentage 
of radioisotope incorporated into labeled DNA. One microliter of the labeling 
reaction is added to 99 µl of 10 mM EDTA in a microcentrifuge tube. This serves 
as the source of diluted probe for total and trichloroacetic acid (TCA)-precipitable 
counts. 2 µl of diluted probe is spotted onto the center of a glass fiber filter disc 
(Whatman number 934-AH). The disc is then allowed to dry and is then placed 
in a vial containing scintillation cocktail for counting total radioactivity in a liquid 
scintillation counter. Another 2 µl aliquot from the diluted probe is added to 1 
ml of 10% ice cold TCA followed by the addition of 2 µl of carrier bovine serum 
albumin (BSA). This mixture is then placed on ice for 10 minutes. The 
precipitate is then collected on a glass filter disc (Whatman No. 934-AH) by 
vacuum filtration. The filter is then washed with 20 ml of ice cold 10% TCA, 
allowed to dry and is placed in a vial containing scintillation cocktail and counted. 

Because primer extension oligonucleotide labeling results in net 
DNA synthesis, the specific activity of labeled DNA is calculated using the 
following guidelines. 

Total cpm incorporated = TCA cpm x 50 x 27 

Wherein the factor 50 is derived from using 2 µl of a 1:100 dilution for TCA 
precipitation. The number 27 converts this back to the total reaction volume 
(which is the reaction volume plus 2 µl of stop solution). 

Synthesized DNA (ng of DNA synthesized) = 
     theoretical yield x fraction of radioactivity incorporated 

Theoretical yield (ng of DNA) = 
     (µCi dNTPs added x 4 x 330 ng/nmole) 
     / specific activity dNTP (Ci/mmole = µCi/nmole) 

Fraction of incorporated label = TCA precipitated cpm / total cpm 

Specific activity (cpm/µg of DNA) = 
     (total cpm incorporated x 1000) / (synthesized DNA + input DNA) 

Wherein 1000 is the factor converting nanograms to micrograms. 

By way of example, the following represents the calculation of 
specific activity for an aliquot of pUC19 DNA labeled using this method. Using 
50 µCi of [α-32P]dCTP in a 25 µl reaction, and if the TCA precipitated cpm is 
26192 and total cpm is 102047: 

Total cpm incorporated = 26192 x 50 x 27 = 3.27 x 10^7 cpm 

Synthesized DNA (ng of DNA synthesized) = 
     theoretical yield x fraction of radioactivity incorporated 

Theoretical yield = (µCi of dNTPs x 4 x 330) / 3000 µCi/nmole 
                  = (50 µCi x 4 x 330) / 3000 
                  = 22 ng 

Fraction of label incorporated = TCA precipitated cpm / total cpm 
                               = 26192 / 102047 = 0.256 

Synthesized DNA = 22 x 0.256 
                = 5.6 ng 

Specific activity (cpm/µg) = 
     (total cpm incorporated x 1000) / (synthesized DNA + input DNA) 

Input DNA = 10 ng 

Specific activity = (3.27 x 10^7 x 1000) / (5.6 + 10) 
                  = 2.09 x 10^9 cpm/µg 
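
The guidelines above can be expressed as a short calculation. The sketch below 
is illustrative: the function and parameter names are hypothetical, and the 
figure of 330 ng/nmole for an average nucleotide is the value implied by the 
22 ng theoretical yield in the worked example. Note that 26192 x 50 x 27 
evaluates to about 3.54 x 10^7, whereas the printed intermediate of 
3.27 x 10^7 equals 26192 x 50 x 25; the volume factor is therefore exposed as 
a parameter rather than fixed. 

```python
def total_cpm_incorporated(tca_cpm: float, dilution_factor: float = 50,
                           volume_factor_ul: float = 27) -> float:
    """TCA-precipitable cpm scaled back to the whole reaction."""
    return tca_cpm * dilution_factor * volume_factor_ul


def theoretical_yield_ng(uci_added: float, uci_per_nmol: float,
                         ng_per_nmol: float = 330, n_dntps: int = 4) -> float:
    """Maximum ng of DNA that could be made from the labeled dNTP supplied."""
    return uci_added * n_dntps * ng_per_nmol / uci_per_nmol


def specific_activity_cpm_per_ug(tca_cpm: float, total_cpm: float,
                                 uci_added: float, uci_per_nmol: float,
                                 input_dna_ng: float,
                                 volume_factor_ul: float = 27) -> float:
    """Specific activity of the probe in cpm per microgram of DNA."""
    incorporated = total_cpm_incorporated(
        tca_cpm, volume_factor_ul=volume_factor_ul)
    fraction = tca_cpm / total_cpm
    synthesized_ng = theoretical_yield_ng(uci_added, uci_per_nmol) * fraction
    return incorporated * 1000 / (synthesized_ng + input_dna_ng)


# Inputs from the worked example: 50 µCi label at 3000 µCi/nmole,
# 26192 TCA-precipitable cpm, 102047 total cpm, 10 ng input DNA.
sa_factor_27 = specific_activity_cpm_per_ug(26192, 102047, 50, 3000, 10)
sa_factor_25 = specific_activity_cpm_per_ug(26192, 102047, 50, 3000, 10,
                                            volume_factor_ul=25)
print(f"{sa_factor_27:.3g} cpm/ug with a volume factor of 27")
print(f"{sa_factor_25:.3g} cpm/ug with a volume factor of 25")
```

With a volume factor of 25 the sketch reproduces the printed specific activity 
of about 2.09 x 10^9 cpm/µg; with the stated factor of 27 the result is 
somewhat higher. 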

Unincorporated radioactive label may be removed using standard 
methods well known in the art. 

Comparisons were made between PEL-RGO and RPL under similar 
conditions using a radiolabeled probe. A detection limit of 100 fg was observed 
using PEL-RGO labeled DNA, compared to a detection limit of 500 fg with RPL. 

Example 12 

Thermal Cycle Labeling and Universal Thermal Cycle Labeling 

Thermal Cycle Labeling (TCL) is a method according to the present 
invention for efficiently labeling double-stranded DNA while simultaneously 
amplifying large amounts of the labeled probe. TCL of DNA requires two 
general steps: 1) generation of the sequence-specific oligonucleotides by CviJI* 
restriction of the template DNA; and 2) repeated cycles of denaturation, 
annealing, and extension in the presence of a thermostable DNA polymerase or 
a functional fragment thereof which maintains polymerase activity. Optimal 
results are obtained after 20 such cycles, which are best performed in an automated 
thermal cycling instrument such as a Perkin-Elmer Model 480 thermocycler. In 
conjunction with such an instrument, about 1.5 hr is required to complete this 
protocol. If a thermal cycler is not available, these reactions may be performed 
using heat blocks. As few as 5 cycles may yield probes with acceptable detection 
sensitivities. The generation of sequence specific oligonucleotides for use in this 
method may also be accomplished using the restriction endonuclease reagent 
CGase I described in Example 20 or the restriction endonuclease Aci I, which has 
as a recognition sequence CCGC. 

Non-radioactive labeling of DNA using TCL is accomplished by 
mixing: 10 pg - 100 ng linearized template, 50 ng CviJI*-digested primers 
(prepared as described above), 1.5 µl 10X labeling buffer, 0.5 µl Thermus flavus 
DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee, 
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New 
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl of 2 mM dTTP. 

Radioactive labeling of DNA using TCL was accomplished by 
mixing 10 pg - 100 ng of CviJI-generated primers, 10 pg - 25 ng of linearized 
template, 1.5 µl of 10X labeling buffer, 5 µl of 32P-dCTP (3000 Ci/mmole, 10 
µCi/µl or 40 µCi/µl), 0.5 µl of Thermus flavus DNA polymerase (5 u/µl), and 0.5 
µl each of dATP, dGTP, and dTTP (1 mM). The reaction mix was 
brought to a volume of 15 µl with deionized H2O, overlaid with mineral oil and 
cycled through 20 rounds of denaturation, annealing and extension. A typical 
cycling regimen employed 20 cycles of denaturation at 91°C for 5 sec, annealing 
at 50°C for 5 sec and extension at 72°C for 30 sec. The reaction is then 
terminated by adding 1 µl of 0.5 M EDTA, pH 8.0. The amplified, labeled probe 
is a very heterogeneous mixture of fragments, which appears as a smear when 
analyzed by agarose gel electrophoresis. 
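
The bookkeeping for this reaction can be sketched as follows. The component volumes and cycle times are read from the paragraph above; the volume of template plus primers is not specified there, so it is left as an input, and the function names are hypothetical.

```python
# Fixed component volumes (microliters) for the radioactive TCL mix.
components_ul = {
    "10X labeling buffer": 1.5,
    "32P-dCTP": 5.0,
    "Thermus flavus DNA polymerase": 0.5,
    "dATP": 0.5,
    "dGTP": 0.5,
    "dTTP": 0.5,
}
FINAL_VOLUME_UL = 15.0


def water_to_add(template_and_primer_ul):
    """Deionized water needed to reach 15 ul; the DNA volume is a user input."""
    used = sum(components_ul.values()) + template_and_primer_ul
    if used > FINAL_VOLUME_UL:
        raise ValueError("components exceed the final reaction volume")
    return FINAL_VOLUME_UL - used


# Hold times per cycle in seconds: denaturation, annealing, extension.
CYCLE_STEPS_S = (5, 5, 30)
CYCLES = 20


def total_hold_time_s(cycles=CYCLES, steps=CYCLE_STEPS_S):
    """Time spent at the programmed temperatures, excluding ramping."""
    return cycles * sum(steps)


print(water_to_add(2.0))    # water for a hypothetical 2 ul of template + primers
print(total_hold_time_s())  # seconds at temperature over 20 cycles
```

The programmed holds add up to only about 13 minutes, so most of the roughly 1.5 hr quoted for the protocol presumably reflects temperature ramping and handling, which the text does not break down.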

Universal thermal cycle labeling (UTCL) is a method according to 
the present invention for efficiently labeling double-stranded DNA while 
simultaneously amplifying large amounts of labeled probe. UTCL is unique in that 
no sequence information is required regarding the template. The extension 
primers are supplied endogenously via the holo-enzyme of the thermostable DNA 
polymerase and any anonymous DNA template can be labeled by repeated cycles 
of denaturation, annealing, and extension in the presence of a labeled 
deoxynucleotide triphosphate. Optimal results are obtained after 20 such cycles, 
which are best performed in an automated thermal cycling instrument such as a 
Perkin-Elmer Model 480 thermocycler. With such an instrument, 
about 1.5 hr is required to complete this protocol. If a thermal cycler is not 
available, these reactions may be performed using heat blocks. As few as 5 
cycles may yield probes with acceptable detection sensitivities. 

Non-radioactive labeling of DNA using UTCL is accomplished by 
mixing: 10 ng linearized template, 1.5 µl 10X labeling buffer, 0.5 µl Thermus 
flavus DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee, 
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New 
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl of 2 mM dTTP. 

Radioactive labeling of DNA using UTCL was accomplished by 
mixing: 10 pg - 100 ng of linearized template, 1.5 µl of 10X labeling buffer, 5 µl 
of 32P-dCTP (3000 Ci/mmole, 10 µCi/µl or 40 µCi/µl), 0.5 µl of Thermus flavus 
DNA polymerase (5 u/µl), and 0.5 µl each of dATP, dGTP, and dTTP (1 mM). 
The reaction mix was brought to a volume of 15 µl with deionized 
H2O, overlaid with mineral oil and cycled through 20 rounds of denaturation, 
annealing and extension. A typical cycling regimen employed 20 cycles of 
denaturation at 91°C for 5 sec, annealing at 50°C for 5 sec and extension at 72°C 
for 30 sec. The reaction is then terminated by adding 1 µl of 0.5 M EDTA, pH 
8.0. The amplified, labeled probe is a very heterogeneous mixture of fragments, 
which appears as a smear when analyzed by agarose gel electrophoresis. 

Estimation of Bio-11-dUTP Incorporation: 

In order to estimate the level of incorporation of biotin-11-dUTP 
into DNA, a serial dilution from 1:10 to 1:10^5 of the labeled probe (free of 
unincorporated biotin-11-dUTP) is made in TE (10 mM Tris, 1 mM EDTA, pH 8). 
A microliter of each dilution is placed on a neutral nylon membrane, and the 
DNA sample is bound to the membrane either by UV cross-linking for 3 min or 
by baking at 80°C for 2 hr. 

The unbound sites on the membrane are blocked using a blocking 
buffer for 15 min at 25°C. Streptavidin-alkaline phosphatase (Gibco-BRL, 
Gaithersburg, Maryland, Cat. No. 9545A) is added to the blocking buffer (0.058 
M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl, 0.02% sodium azide, 0.5% 
casein hydrolysate, 0.1% Tween-20) at a 1:5000 dilution and incubated for 30 
min. The membrane is then rinsed 3 times for 10 min each with wash buffer (1X 
PBS [0.058 M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl], 0.3% Tween, 
0.2% sodium azide) and rinsed briefly (5 minutes) with AP buffer (100 mM NaCl, 
5 mM MgCl2, 100 mM Tris-Cl pH 9.5). Enough AP buffer containing 
4.0 µl/ml nitro blue tetrazolium (NBT) (Sigma Cat. No. N6639) and 
3.5 µl/ml of 5-bromo-4-chloro-3-indolyl phosphate (BCIP) (Sigma Cat. No. 
B6777) is then added to cover the membrane. The membrane is left in the dark for 
approximately 30 minutes or until the reaction is complete. The reaction is 
stopped by rinsing in 1X PBS. 

Detection Sensitivity 

32P-labeled probes generated by the labeling protocols described above 
detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target 
sequence. As little as 10 pg of template DNA is enough to synthesize 5-10 ng of 
radiolabeled probe, which is sufficient for screening 5 Southern blots. The 
radioactive versions of TCL and UTCL facilitate extremely high specific activities 
of labeled probe (about 5 x 10^9 cpm/µg DNA), which permit 5-10 fold lower 
detection limits than conventional labeling protocols. The synthesis of higher 
specific activity probes is probably the net result of the sequence-specific 
oligonucleotide primers and their increased length when compared to the short 
random primers used in other labeling methods. In addition, the thermal cycling 
permits probe amplification. 

Biotin-labeled probes generated by the TCL and UTCL protocols 
detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target sequence. A 15 
µl TCL or UTCL reaction yields as much as 5-10 µg of labeled DNA, enough to 
probe 5 to 10 Southern blots. Biotin-labeled TCL and UTCL probes provide a 
10 fold greater detection sensitivity when compared to RPL biotin probes. In 
addition, the thermal cycling permits probe amplification. 
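
To put the stated detection limit in more tangible terms, a mole quantity can be converted to a count of target molecules with Avogadro's number. This is only an illustrative conversion; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def moles_to_molecules(moles):
    """Number of molecules corresponding to a quantity in moles."""
    return moles * AVOGADRO


detection_limit_mol = 25e-21  # 25 zeptomoles = 2.5 x 10^-20 mol
print(f"{moles_to_molecules(detection_limit_mol):.0f} target molecules")
```

A limit of 25 zeptomoles therefore corresponds to roughly fifteen thousand copies of the target sequence.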

Non-radioactive, biotinylated probes labeled by the TCL and UTCL 
methods were shown to have detection limits identical to those of the radioactive 
probes. These methods have the advantage of eliminating the need to work with 
hazardous radioactive materials without sacrificing sensitivity. In addition, results 
are obtained from non-isotopic probes in 3-4 hours compared to 3-4 days for 
radiolabeled probes. The ability to substitute non-radioactive probes for 
radioactive probes may be very useful to clinical laboratories, which do not use 
radioisotopes but do need greater detection sensitivities. Research laboratories 
favor the use of non-isotopic systems if detection sensitivity is not an issue. The 
non-isotopic labeling versions of the TCL and UTCL systems represent a major 
improvement in labeling DNA probes. Non-radioactive probes generated by the 
methods of the present invention are also useful in the detection of RNA in situ. 
An advantage of this system is that the labeling protocols of the present invention 
yield highly sensitive non-radioactive probes, and the probes are 
predominantly in the small molecular weight range and can therefore penetrate the 
tissue easily, unlike RPL probes. Because non-radioactive probes labeled using the 
labeling protocols of the present invention have the same detection limits as do 
radioactive probes similarly labeled, it is within the scope of this invention to use 
either radioactive or non-radioactive probes for probing, for example, Southern 
blots and Northern blots, for in situ hybridization for the detection of mRNA or DNA 
in cells or tissue directly, and for colony or plaque lifts. 

Example 13 

Quasi-Random Fragmentation of DNA 

Shotgun cloning and sequencing requires the generation of an 
overlapping population of DNA fragments. Therefore, conditions were 
established for the partial digestion of DNA with CviJI to produce an apparently 
random pattern, or smear, of fragments in the appropriate size range. 
Conventional methods for obtaining partially restricted DNA include limiting the 
incubation time or limiting the amount of enzyme used in the digestion. Initially, 
agarose gel electrophoresis and ethidium bromide staining of the treated DNA 
were utilized to assess the randomness and size distribution of the fragments. 

CviJI was obtained from CHIMERx (Madison, Wisconsin). 
Digestion of pUC19 DNA for limited time periods, or with limiting amounts of 
CviJI under normal or relaxed conditions, did not produce a quasi-random 
restriction pattern, or smear. Instead, a number of discrete bands were observed, 
as shown in Figure 7, lane 3 for the CviJI* partial digestion of pUC19. Complete 
digests of pUC19 under normal and CviJI* buffer conditions are shown in lanes 
1 and 2 respectively. These results show that, under these relaxed conditions, 
CviJI has a strong restriction site preference. 

To eliminate the apparent restriction site preferences observed 
under the partial restriction conditions described above, a series of altered reaction 
conditions were explored. Conditions of high pH, low ionic strength, addition of 
solvents such as glycerol or dimethylsulfoxide, and/or substitution of Mn2+ for 
Mg2+ were systematically tested with CviJI endonuclease using the plasmid 
pUC19. Figure 7 shows the results of these tests. In Lane M, a 100 bp DNA 
ladder was run. In Lanes 1-4, pUC19 DNA (1.0 µg) was run after digestion at 
37°C in a 20 µl volume for the following times and conditions: Lane 1, complete 
CviJI digest (1 unit of enzyme for 90 min in 50 mM Tris-HCl, pH 8.0, 10 mM 
MgCl2, 50 mM NaCl); Lane 2, complete CviJI* digest (1 unit of enzyme for 90 
min in 50 mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20 
mM DTT); Lane 3, partial CviJI* digest (0.25 units of enzyme for 30 min in 50 
mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20 mM 
DTT); Lane 4, partial CviJI** digest (0.5 units of enzyme for 60 min in 10 mM 
Tris-HCl, pH 8.0, 10 mM MgCl2, 10 mM NaCl, 1 mM ATP, 20 mM DTT, 20% 
v/v DMSO). Lane 5 contains uncut pUC19 (1.0 µg). 

The digestion condition which yielded the best "smearing" pattern 
was obtained when the ionic strength of the relaxed reaction buffer was lowered 
and an organic solvent was added (Figure 7, lane 4). Plasmid pUC19 partially 
digested under these conditions yields a relatively non-discrete smear. This 
activity is referred to as CviJI** to differentiate it from the originally 
characterized star activity described in Xia et al., Nucl. Acids Res. 15:6075-6090 
(1987). The appearance of diffuse, faint bands overlying a background smear 
generated from this 2686 bp molecule indicates that some weakly preferred or 
resistant restriction sites may bias the results of subsequent cloning experiments. 

DNA was mechanically sheared by sonication utilizing a Heat 
Systems Ultrasonics (Farmingdale, New York) W-375 cup horn sonicator as 
specified by Bankier et al., Methods in Enzymology 155:51-93 (1987). DNA 
fragmented by this method has random single-stranded overhanging ends (ragged 
ends). 

CviJI**-digested and sonicated samples were size fractionated by 
agarose gel electrophoresis and electroelution, or by spin columns packed with the 
size exclusion gel matrix Sephacryl S-500 (Pharmacia LKB, Piscataway, N.J.), to 
eliminate small DNA fragments. Spin columns (0.4 cm in diameter) were packed 
to a height of 1.3 cm by adding 1 ml of Sephacryl S-500 slurry and centrifuging 
at 2000 RPM for 5 minutes in a Beckman CPR centrifuge. The columns were 
rinsed 3 times with 1 ml aliquots of 100 mM Tris-HCl (pH 8.0) by centrifugation 
at 2000 RPM for 2 min. Typically, 0.2-2.0 µg of fragmented DNA in a total 
volume of 30 µl was applied to the column. The void volume, containing those 
DNA fragments larger than 500 bp, was recovered in the column eluant after 
spinning at 2000 RPM for 5 minutes. The capacity of this microcolumn 
procedure is 2 µg of DNA. Agarose gel electrophoresis and electroelution are 
described in detail by Sambrook et al., Molecular Cloning: A Laboratory Manual, 
Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 
(1989), and are well known to those skilled in the art. In these experiments, 5 µg 
of sample was pipetted into a 2 cm-wide slot on a 1% agarose gel. 
Electrophoresis was halted after the bromophenol blue tracking dye had migrated 
6 cm. Fragments larger than 750 bp, as judged by molecular size markers, were 
separated from smaller sizes and electrophoresed onto dialysis tubing (1000 MW 
cutoff). The fractionated material was extracted with phenol-chloroform and 
precipitated using ice cold ethanol (50% final volume) and ammonium acetate (2.5 
M final concentration). 

The ragged ends of the sonicated DNA were rendered blunt 
utilizing two different end repair reactions. In one end repair reaction (ER 1), 
sonicated DNA was treated according to the procedure outlined by Bankier et al., 
Methods in Enzymology 155:51-93 (1987), where 2.0 µg of sonicated lambda 
DNA is combined with 10 units of the Klenow fragment of DNA polymerase I, 
10 units of T4 DNA polymerase, 0.1 mM dNTPs (deoxynucleotide 
triphosphates = deoxyadenosine triphosphate, deoxythymidine triphosphate, 
deoxycytidine triphosphate, and deoxyguanosine triphosphate) and reaction buffer 
(50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 10 mM DTT). This mixture was 
incubated at room temperature for 30 min followed by heat denaturation of the 
enzymes at 65°C for 15 minutes. In a second end repair reaction (ER 2), an 
excess of the reagents and enzymes described above was utilized to ensure a 
more efficient conversion to blunt ends. In this reaction, 0.2 µg of the sonicated 
lambda DNA sample was treated under the same reaction conditions described 
above. 
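
The difference between the two end repair reactions is simply the ratio of enzyme to DNA. A minimal sketch of that arithmetic, using only the quantities given above (the helper name is hypothetical):

```python
def units_per_ug(enzyme_units, dna_ug):
    """Enzyme units available per microgram of DNA substrate."""
    return enzyme_units / dna_ug


# Each polymerase is used at 10 units in both reactions; only the DNA mass differs.
er1 = units_per_ug(10, 2.0)  # ER 1: 2.0 ug sonicated DNA
er2 = units_per_ug(10, 0.2)  # ER 2: 0.2 ug sonicated DNA

print(f"ER 1: {er1:.0f} u/ug, ER 2: {er2:.0f} u/ug, excess: {er2 / er1:.0f}-fold")
```

Reducing the DNA input tenfold while holding the reagents constant gives a tenfold excess of enzyme per microgram of DNA in ER 2.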

Figure 8 shows comparisons of the size distributions of sonicated 
DNA versus DNA that was partially digested with CviJI**. In Lanes M, a 1 kb 
DNA ladder was run. In Lanes 1-3, untreated λ DNA (0.25 µg), sonicated λ 
DNA (1.0 µg), and CviJI** partially-digested λ DNA (1.0 µg) were run, 
respectively. In Lanes 4-6, untreated pUC19 (0.25 µg), sonicated pUC19 (1.0 
µg), and CviJI** partially-digested pUC19 (1.0 µg) were run, respectively. 

Fragmentation of a large substrate such as lambda DNA (48.5 kb) 
revealed essentially no banding differences between the CviJI** method and 
sonication, as demonstrated in Figure 8, lanes 2 and 3. In addition, pUC19 DNA 
that was partially digested with CviJI** gave a size distribution or "smear" that 
closely resembled that achieved with sonication (Figure 8, lanes 5 and 6). As 
expected, the minor bias evident with a small molecule such as pUC19 was not 
detectable with a larger substrate such as lambda DNA. 

The intensity and duration of sonic treatment affect the size 
distribution of the resulting DNA fragments. The results obtained from the 
sonication of lambda and pUC19 samples (Figure 8) were obtained from three 20 
second pulses at a power setting of 60 watts. The smears generated by the two 
methods are similar, although the fragments are consistently larger with 
CviJI fragmentation. This result favors the cloning of larger inserts, which 
improves the efficiency of end-closure strategies (Edwards et al., Genome 
6:593-608 (1990)). The size distribution of the DNA fragmented by CviJI** is 
controlled by incubation time and amount of enzyme, variables which are readily 
optimized by routine analysis. An excess of enzyme or a long incubation time 
will completely digest pUC19 DNA, resulting in fragments which range in size 
from approximately 20 bp to approximately 150 bp (Figure 7, lanes 1 and 2). 
The results shown in Figure 8 were obtained by incubating pUC19 for 40 minutes 
and lambda DNA for 60 minutes with 0.33 units of CviJI/µg substrate. The 
efficiencies of the two methods for randomly fragmenting DNA were 
quantitatively analyzed for use in molecular cloning, as described below. 
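
The idea of quasi-random fragmentation by partial digestion can be illustrated with a toy simulation. The sketch below is not the procedure of this example: it uses a random stand-in sequence rather than pUC19, treats the molecule as linear, assumes every GC dinucleotide is cut independently with the same probability, and assumes cleavage between the G and the C. The site patterns (PuGCPy for normal sites, GC for relaxed sites) follow the definitions given later in this document; all function names are hypothetical.

```python
import random
import re


def find_sites(seq, pattern, cut_offset):
    """Cut positions for every (possibly overlapping) match of pattern."""
    # A lookahead is used so that overlapping matches are all reported.
    return [m.start() + cut_offset for m in re.finditer(f"(?=(?:{pattern}))", seq)]


def partial_digest(seq_len, sites, p_cut, rng):
    """Cut a linear molecule at each site independently with probability p_cut."""
    cuts = [s for s in sites if rng.random() < p_cut]
    bounds = [0] + cuts + [seq_len]
    return [b - a for a, b in zip(bounds, bounds[1:])]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
seq = "".join(rng.choice("ACGT") for _ in range(2686))  # random stand-in, NOT pUC19

normal = find_sites(seq, "[AG]GC[CT]", 2)  # PuGCPy, assumed cut between G and C
relaxed = find_sites(seq, "GC", 1)         # any GC dinucleotide

fragments = partial_digest(len(seq), relaxed, 0.05, rng)
print(len(normal), "normal sites,", len(relaxed), "relaxed sites")
print(len(fragments), "fragments, mean size", sum(fragments) // len(fragments), "bp")
```

In this simplified model the cut probability plays the role of enzyme amount and incubation time: raising it shifts the fragment distribution toward smaller sizes, and a value of 1 corresponds to a complete digest.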

Example 14 

Rapid DNA Size Fractionation Utilizing Spin Column Chromatography 

The amount of data obtained by the shotgun sequencing approach 
is substantially increased if fragments of less than 500 bp are eliminated prior to 
the cloning step. Small fragments yield only a portion of the sequence data which 
may be collected from polyacrylamide gel based separations and, thus, such small 
fragments lower the efficiency of this strategy. Agarose gel electrophoresis 
followed by electroelution is commonly used to size fractionate DNA prior to 
shotgun cloning (Bankier et al., Methods in Enzymol. 155:51-93 (1987)). 
Approximately three hours are required to prepare the agarose gel, electrophorese 
the sample, electroelute fragments larger than 500 bp, perform phenol-chloroform 
extractions, and precipitate the resulting material. 

The results of 5 out of 9 independent trials size-fractionating 
CviJI**-fragmented lambda DNA by agarose gel electrophoresis are shown in 
Figures 9A-E. Figures 9A-E illustrate the following. In Figure 9A: lane M, 
1 kb DNA ladder; lane λ, untreated λ DNA (0.25 µg); lane 1, unfractionated 
(UF) CviJI** partially-digested λ DNA (1.0 µg); lane 2, column-fractionated (CF) 
CviJI** partially-digested λ DNA (1.0 µg); lane 3, gel-fractionated (GF) CviJI** 
partially-digested λ DNA (1.0 µg). Figures 9B-E show additional trials of the 
same treatments, with lanes labeled as in Figure 9A. 

Small DNA fragments may also be removed by passing the sample 
through a short column of Sephacryl S-500. Approximately 15 min. are needed 
to prepare the column and 5 min. to fractionate the DNA by this method. 

The results of three out of nine trials using a Sephacryl S-500 
column are shown in Figures 9A-C. The efficiency of eliminating small DNA 
fragments (<500 bp) by spin column chromatography appears high, and the 
reproducibility was excellent. This result is in contrast to the agarose gel 
electrophoresis and electroelution data presented in Figures 9A-E wherein nine 
replicate trials of this method yielded nine differently sized products, regardless 
of the source of the agarose. Both methods yielded 30-10% recoveries as 
measured by UV spectrophotometry. To quantitate the relative efficiencies of the 
two fractionation methods, the lambda DNA size fractionated in Figure 9A lanes 
2 and 3, and Figure 9B lane 3 were analyzed for cloning efficiency and insert 
size, as described below. 

Example 15 

Cloning Efficiencies of Gel Elution and 
Chromatography Fractionation Methods 

The efficacy of size selection was quantified by two criteria: 1) by 
comparing the relative cloning efficiency of CviJI** partially-digested lambda 
DNA fragments fractionated either by agarose gel electrophoresis and 
electroelution or by micro-column chromatography, and 2) by determining the size 
distribution of the resulting cloned inserts. To reduce potential variables, large 
quantities of the cloning vector and ligation cocktail were prepared, ligation 
reactions and transformation of competent E. coli were performed on the same 
day, numerous redundant controls were performed, and all cloning experiments 
were repeated twice. Ligation reactions were carried out overnight at 12°C in 20 
µl mixtures using the following conditions: 25 mM Tris-HCl (pH 7.8), 10 mM 
MgCl2, 1 mM DTT, 1 mM ATP, DNA, and 2000 units of T4 DNA ligase. For 
unfractionated samples, 10 ng of fragments and 100 ng of HincII-restricted, 
dephosphorylated pUC19 were combined under the above conditions. For 
Sephacryl S-500 fractionated samples, 50 ng of size-selected fragments were 
ligated with 100 ng of HincII-restricted, dephosphorylated pUC19. This increase 
in fractionated DNA was determined empirically to compensate for the lower 
concentration of "ends" resulting from the fractionation procedure and/or the 
lowered efficiency of cloning larger fragments. Ligation reaction products were 
added to competent E. coli DH5αF' (φ80dlacZΔM15 Δ(lacZYA-argF)U169 deoR 
gyrA96 recA1 relA1 endA1 thi-1 hsdR17(rK-, mK+) supE44 λ-) in a 
transformation mixture as specified by the manufacturer (Life Technologies, 
Bethesda, Maryland) and aliquots of the transformation mixture were plated on 
T agar (Messing, Methods in Enzymol. 101:20-78 (1983)) containing 20 µg/ml 
ampicillin, 25 µl of a 2% solution of isopropylthiogalactoside (IPTG) and 25 µl 
of a 2% solution of 5-bromo-4-chloro-3-indolylgalactoside (X-GAL). The 
cloning efficiencies reported are the average of triplicate platings of each ligation 
reaction. The concentration of the fractionated material was checked 
spectrophotometrically so that 50 ng was added to all ligation reactions. This 
material was ligated to HincII-digested and dephosphorylated pUC19. This 
cloning vector was chosen because it permits a simple blue to white visual assay 
to indicate whether a DNA fragment was cloned (white) or not (blue) (Messing, 
Methods in Enzymol. 101:20-78 (1983)). 
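
The remark about the concentration of "ends" can be made concrete by converting DNA mass to molar amounts. The sketch below uses the common approximation of about 650 g/mol per base pair, which is not taken from this document, and assumes the 1630 bp average insert size reported later in this example for column-fractionated clones and the 2686 bp length of pUC19. The function name is hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; a common approximation, not from the text


def ng_to_pmol(ng, length_bp):
    """Picomoles of double-stranded DNA molecules in a given mass."""
    return ng * 1000.0 / (length_bp * AVG_BP_MASS)


vector_pmol = ng_to_pmol(100, 2686)  # 100 ng linearized pUC19
insert_pmol = ng_to_pmol(50, 1630)   # 50 ng fragments, assumed 1630 bp on average

print(f"vector: {vector_pmol:.3f} pmol")
print(f"insert: {insert_pmol:.3f} pmol")
print(f"insert:vector molar ratio: {insert_pmol / vector_pmol:.2f}")
```

Under these assumptions the 50 ng of size-selected fragments supplies slightly fewer molecules than the 100 ng of vector, which is consistent with the need to add more fractionated DNA by mass than unfractionated DNA.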

A summary of the cloning efficiencies calculated from two 
independent trials is given in Table 3. 

TABLE 3 

Cloning Efficiencies of CviJI** Partially Digested Lambda DNA 
Fractionated by Microcolumn Chromatography Versus Agarose Gel 
Electroelution. 

                                          Colony Phenotype 
                                     Trial I           Trial II 
DNA/treatment                        Blue    White     Blue    White 

Supercoiled pUC19                    55000   <10       50000   <10 
pUC19/HincII/CIAP                    210     <1        320     1 
pUC19/HincII/CIAP/T4 DNA ligase      150     4         210     [illegible] 
λ/CviJI** partial/CF + pUC19         140     240       210     240 
λ/CviJI** partial/GFE1 + pUC19       98      49        200     18 
λ/CviJI** partial/GFE2 + pUC19       82      54        95      74 

Cloning efficiencies reflect the number of ampicillin-resistant 
colonies/ng pUC19 DNA. CIAP represents treatment with calf intestinal alkaline 
phosphatase used to dephosphorylate SMI-digested pUC19 to minimize 
self-ligation. CF refers to DNA that was fractionated on Sephacryl S-500 columns as 
described above. GFE1 and GFE2 refer to two runs wherein DNA was 
fractionated by agarose gel electrophoresis and electroeluted. λ refers to 
bacteriophage λ DNA. 

These trials represent repeated experiments in which λ DNA 
fragments generated by CviJI** partial digestion were ligated to HincII-linearized, 
dephosphorylated pUC19 and transformed into the DH5αF' competent cells described 
above. The first three rows in Table 3 show controls performed to establish a 
baseline to better evaluate the various treatments. Supercoiled pUC19 transforms 
E. coli 10 times more efficiently than the HincII-digested plasmid and 150-260 
times more efficiently than the HincII-digested and dephosphorylated plasmid. 
The number of blue and white colonies which resulted from transforming 
HincII-cut and dephosphorylated pUC19 was determined both before and after treatment 
with T4 DNA ligase in order to differentiate these background events from 
cloning inserts. The background of blue colonies (which represent the uncut 
and/or non-dephosphorylated population of molecules) averaged 0.4%, compared 
to supercoiled plasmid. The background of white colonies (which presumably 
results from contaminating nucleases in the enzyme treatments or genomic DNA 
in the plasmid preparations) after HincII-digestion, dephosphorylation, and ligation 
of pUC19 averaged 0.014% as compared to the supercoiled plasmid. 

The number of white colonies obtained when micro-column 
fractionated DNA was cloned into pUC19 was 240/ng vector in both trials. The 
efficiency of cloning gel fractionated and electroeluted DNA ranged from 18-74 
white colonies/ng vector. The data show that column fractionated DNA results 
in three to thirteen times the number of white colonies, and presumably 
recombinant inserts, as gel fractionated and electroeluted DNA. The size 
distribution of the inserts present in these white colonies is depicted in Figures 
10A-C. In Figure 10A, a CviJI** partial digest of 2 µg of λ DNA was size 
fractionated on a 4 mm by 13 mm column of Sephacryl S-500 at 2,000 x g for 5 
minutes. The void volume containing partially digested DNA was directly ligated 
to linear, dephosphorylated pUC19 and 43 resulting clones were analyzed for 
insert size. The DNA for this experiment is the same as that shown in Figure 
9A, lane 2. In Figure 10B, a CviJI** partial digest of 5 µg of λ DNA was size 
fractionated by agarose gel electroelution. The eluted DNA was phenol-extracted 
and ligated to linear, dephosphorylated pUC19, and the resulting 40 clones were 
analyzed for insert size. The DNA for this experiment is the same as that shown 
in Figure 9A, lane 3. In Figure 10C, the procedure is the same as in Figure 10B, 
except the DNA for this experiment came from Figure 9B, lane 3. 
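
The "three to thirteen times" figure quoted above follows directly from the white-colony counts in Table 3. A short sketch of the calculation, with hypothetical variable names:

```python
# White colonies per ng vector from Table 3, as (Trial I, Trial II).
column = (240, 240)
gel = {"GFE1": (49, 18), "GFE2": (54, 74)}

gel_values = [v for pair in gel.values() for v in pair]

low_fold = min(column) / max(gel_values)   # smallest advantage of the column
high_fold = max(column) / min(gel_values)  # largest advantage of the column

print(f"column vs gel: {low_fold:.1f}x to {high_fold:.1f}x more white colonies")
```

The lower bound compares the column result with the best gel trial (74 colonies/ng) and the upper bound with the worst (18 colonies/ng).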

A total of 43 random clones obtained from micro-column 
chromatography fractionation were analyzed for insert size (as shown in Figure 
10A). Most of these inserts were larger than 500 bp (37/43 or 86%), 11.6% 
(5/43) were smaller than 500 bp, and one clone (2.3%) was smaller than 250 bp. 
The average insert size was 1630 bp. These results are in contrast to those 
obtained by agarose gel fractionation (as shown in Figures 10B and 10C). In the 
first trial (Figure 10B) most of the inserts were smaller than 500 bp (26/37 or 
70.3%) and only 29.7% (11/37) were larger than 500 bp in size. In the second 
trial (Figure 10C) all of the inserts (40 total) were smaller than 500 bp. Thus, 
the use of agarose gel electroelution for the size fractionation of DNA results in 
unexpectedly variable and low cloning efficiencies. 
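
The percentages in this paragraph can be reproduced from the clone counts. The sketch below only restates the arithmetic; the bin labels are a paraphrase of the size classes named in the text and the helper name is hypothetical.

```python
def pct(count, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Column-fractionated clones (Figure 10A): 43 analyzed.
column_clones = {"larger than 500 bp": 37, "smaller than 500 bp": 5,
                 "smaller than 250 bp": 1}
# Gel-fractionated clones, first trial (Figure 10B): 37 counted in the text.
gel_clones = {"smaller than 500 bp": 26, "larger than 500 bp": 11}

for label, clones in (("column", column_clones), ("gel", gel_clones)):
    total = sum(clones.values())
    for size_class, n in clones.items():
        print(f"{label}: {n}/{total} {size_class} = {pct(n, total)}%")
```

The three column classes sum to the 43 clones analyzed, and the two gel classes sum to 37.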



Example 16 

Cloning Sonicated and CviJI**-Digested Lambda DNA 

To compare the cloning efficiencies of sonicated and 
CviJI**-digested nucleic acid, λ DNA was fragmented by each of these methods and 
ligated to pUC19 which was linearized with HincII and dephosphorylated to 
minimize self-ligation. 

DNA fragmented by CviJI** digestion and sonication was cloned 
both before and after Sephacryl S-500 size fractionation. Sonicated lambda DNA 
was subjected to an end repair treatment prior to ligation. Ligations were 
performed as described in Example 15. One-tenth of the ligation reaction (2 µl) 
was utilized in the transformation procedure, and the fraction of nonrecombinant 
(blue) versus recombinant (white) colonies was used to calculate the efficiency of 
this process. 

The efficacy of the methods was quantified by comparing the 
cloning efficiency of lambda DNA fragments generated either by sonication or 
CviJI** partial digestion. To reduce potential cloning differences based on size 
preference, the size distribution of the DNA generated by these two methods was 
closely matched. Other experimental details were designed to reduce potential 
variables, as described above. Certain variables were unavoidable, however. For 
example, the sonicated DNA fragments required an enzymatic step to repair the 
ragged ends as described in Example 13 prior to ligation, whereas the CviJI** 
digests were heat-denatured and directly ligated to HincII-digested pUC19. 

A summary of the cloning efficiencies calculated from two 
independent trials is given in Table 4, section A (unfractionated samples) and 
section B (fractionated samples). 

Cloning efficiencies represent the number of ampicillin-resistant 
colonies/ng pUC19 DNA. CIAP indicates treatment with calf intestinal alkaline 
phosphatase. ER 1 and ER 2 are end repair methods described in Example 13. 
λ refers to bacteriophage lambda. 

The indicated trials represent repeated experiments in which two 
identical sets of lambda DNA fragments generated by AluI complete digestion, 
CviJI** partial digestion, or sonication were each ligated to HincII-linearized, 
dephosphorylated pUC19 and transformed into DH5αF' competent cells. The 
cloning efficiencies reported are the average of triplicate platings of each ligation 
reaction. In case the Sephacryl S-500 size fractionation step introduced inhibitors 
of ligation or transformation or resulted in differences attributable to the size of 
the material, the sonicated and CviJI**-digested samples were ligated with pUC19 
both prior to (A) and after (B) the fractionation steps. The first three rows in 
Table 4, sections A and B, are controls performed to establish a baseline to better 
evaluate the various treatments. These data show that supercoiled pUC19 
transforms E. coli 200-1000 times more efficiently than the HincII-restricted and 
dephosphorylated plasmid. Without this dephosphorylation step, the cloning 
efficiency is 10% that of the supercoiled molecule (data not presented). The 
background of blue colonies averaged 0.5% in these experiments, compared to 
supercoiled plasmid, while the background of white colonies averaged 0.005%. 

A comparison of the data from unfractionated versus fractionated 
samples in Table 4, sections A and B, reveals a general decline in the number of 
white and blue colonies obtained after sizing. This decrease is primarily due to 
the fact that cloning efficiencies are dependent upon the size of the fragment, 
favoring smaller fragments and thus giving higher efficiencies for the 
unfractionated material. This is illustrated by comparing the efficiency of cloning 
unfractionated and fractionated λ DNA which was completely restricted with AluI. 
This four base recognition endonuclease produces blunt ends and cuts λ DNA 
(48,502 bp) at 143 sites. Only 25 of the resulting 144 fragments (17%) are larger 
than 500 bp. The number of white colonies obtained when unfractionated λ 
DNA, completely restricted with AluI, was cloned into pUC19 ranged from 
250-400/ng vector, versus 23-48/ng vector for the fractionated material. This ten fold 
decrease was only noticed for the λ AluI digests, and probably reflects the large 
portion of small molecular weight fragments (approximately 75%) which is 
excluded from the fractionated ligation reactions. 
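
The fragment counts quoted for the AluI digest follow from the number of sites on a linear molecule. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def fragment_count(n_sites, circular=False):
    """Fragments from a complete digest: n + 1 for a linear molecule, n for a circle."""
    if circular:
        # A circle with no sites stays as one uncut molecule.
        return n_sites if n_sites > 0 else 1
    return n_sites + 1


LAMBDA_BP = 48502
n_fragments = fragment_count(143)      # linear lambda DNA, 143 AluI sites
large_fraction = 25 / n_fragments      # fragments larger than 500 bp
mean_size_bp = LAMBDA_BP / n_fragments

print(n_fragments, "fragments")
print(f"{100 * large_fraction:.0f}% larger than 500 bp")
print(f"mean fragment size about {mean_size_bp:.0f} bp")
```

With a mean fragment size well under 500 bp, it is unsurprising that size fractionation removes most of the clonable molecules from a complete AluI digest.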

The number of white colonies obtained when unfractionated CviJI** 
treated λ DNA was cloned into pUC19 ranged from 160-340/ng vector, versus 
68-90 white colonies/ng vector if the same material was fractionated. Unfractionated 
λ DNA, completely digested with AluI, results in cloning efficiencies very similar 
to unfractionated CviJI** treated DNA. Sonicated λ DNA is a poor substrate for 
ligation, compared to CviJI** treatment, as indicated by the roughly ten-fold 
reduced cloning efficiencies. 

Enzymatic repair of the ragged ends produced by sonication results 
in an increased cloning efficiency. Using conditions described in Example 13 for 
the first end repair treatment (ER 1), 10-44 (fractionated) and 19-32 
(unfractionated) white colonies/ng vector were observed. However, ER 1 
conditions may not be optimal, as an alternate end repair reaction (ER 2) (as 
described in Example 13) resulted in greater numbers of white colonies (63 and 
100/ng vector for fractionated and unfractionated DNA, respectively). In this 
reaction, a ten-fold excess of reagents and enzymes was utilized to repair the 
sonicated DNA, which apparently improved the efficiency of cloning such 
molecules by two to three fold. The data collected from multiple cloning trials 
in Table 4, sections A and B, show that CviJI** partial digestion results in three 
to sixteen times the number of white colonies as sonicated ER 1-treated DNA. 
Even with an optimal end repair reaction for the sonicated fragments, DNA 
treated with CviJI** yielded three times more white colonies. 
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Example 17 

Analysis of CviJI** Fragmentation for Shotgun Cloning and Sequencing

The ability of CviJI** partial digestion to create uniformly
representative clone libraries for DNA sequencing was tested on pUC19 DNA.
pUC19 DNA was digested under CviJI** conditions and size fractionated as
described above. The fractionated DNA was cloned into the EcoRV site of
M13SPSI, a lacZ minus vector constructed by adding an EcoRV restriction site
to wild type M13 at position 5605. M13SPSI lacks a genetic cloning selection
trait; therefore, after ligation of the pUC19 fragments into the vector, the sample
was restricted with EcoRV to reduce the background of nonrecombinant plaques.
Bacteriophage M13 plaques were picked at random and grown for 5-7 hours in 2
ml of 2XTY broth containing 20 µl of a DH5αF' overnight culture. After
centrifugation to remove the cells, single-stranded phage DNA was purified using
Sephaglass™ as specified by the manufacturer (Pharmacia LKB, Piscataway, New
Jersey). The single-stranded DNA was sequenced by the dideoxy chain
termination method using a radiolabeled M13-specific primer and Bst DNA
polymerase (Mead et al., Biotechniques 11:76-87 (1991)). The first 100 bases of
76 randomly chosen clones were sequenced to determine which CviJI recognition
site was utilized, the orientation of each insert and how effectively the cloned
fragments covered the entire molecule, as shown in Figure 11. The positions of
the 45 normal CviJI sites (PuGCPy) in pUC19 are indicated beneath the line
labeled "NORMAL" in Figure 11. Similarly, the 160 CviJI* sites (GC) are
indicated beneath the line labeled "RELAXED" in Figure 11. The marks above
these lines indicate the CviJI** pUC19 sites which were found in the set of 76
sequenced random clones. The frequency of cloning a particular site is indicated
by the height of the line, and the left or right orientation of each clone is also
indicated at the top of each mark. There are a total of 205 CviJI and CviJI* sites
in pUC19.

The data presented in Figure 11 demonstrate that, under CviJI**
partial conditions, normal CviJI sites are preferentially restricted over relaxed
(CviJI*) sites. Of the 76 clones that were analyzed, only 13%, or 1 in 7, had
sequence junctions corresponding to a relaxed CviJI* site. Thirty-five of the
forty-five possible normal restriction sites were cloned, as compared to eight of
the possible one hundred sixty relaxed sites. If the enzyme had exhibited no
preference for normal or relaxed sites under the CviJI** partial conditions utilized
here, then 78% of the sequence junctions analyzed should have been generated by
cleavage at a relaxed CviJI* site. It may be noted that the relaxed CviJI*
restriction sites that were found appear to be clustered in two regions of the
plasmid that are deficient in normal CviJI sites. In addition, the combined
distribution of the normal and relaxed sites which were restricted to generate the
76 clones appears to be quasi-random. That is, the longest gap between cloned
restriction sites was no greater than 250 bp and no one particular site is
over-utilized.
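
The 78% expectation quoted above follows directly from the site counts given in the text (45 normal and 160 relaxed sites in pUC19). The short Python sketch below simply recomputes that figure, together with the fraction of each site class that was actually recovered; the function and variable names are illustrative only and are not part of the original disclosure.

```python
# Site counts for pUC19 as stated in the text.
NORMAL_SITES = 45     # PuGCPy ("normal" CviJI sites)
RELAXED_SITES = 160   # remaining GC sites (CviJI* "relaxed" sites)


def expected_relaxed_fraction(normal: int, relaxed: int) -> float:
    """Fraction of junctions expected at relaxed sites if the enzyme
    showed no preference between normal and relaxed sites."""
    return relaxed / (normal + relaxed)


expected = expected_relaxed_fraction(NORMAL_SITES, RELAXED_SITES)
normal_recovered = 35 / NORMAL_SITES     # 35 of 45 normal sites were cloned
relaxed_recovered = 8 / RELAXED_SITES    # 8 of 160 relaxed sites were cloned

print(f"expected relaxed junctions (no preference): {expected:.0%}")
print(f"normal sites recovered:  {normal_recovered:.0%}")
print(f"relaxed sites recovered: {relaxed_recovered:.0%}")
```

Comparing the no-preference expectation with the 13% of junctions observed at relaxed sites is what supports the statement that normal sites are preferentially cleaved under partial conditions.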

A detailed analysis of the distribution of CviJI** sequence junctions
found from cloning pUC19 is presented in Table 5.

The GC sites in pUC19 may be divided into four classes based on
their flanking Pu/Py structure. The fraction of GC sites observed in pUC19 which
belong to each classification is roughly equal (22.0-27.8%). A striking difference
was found between the observed distribution in pUC19 of normal and relaxed (R1,
R2, R3) CviJI recognition sites and the distribution revealed by shotgun cloning
and sequence analysis of CviJI**-treated DNA. While most of the sites cleaved
by this treatment were found to be PuGCPy (about 87%), or "normal" restriction
sites, a significant fraction of the cleavage occurred at PyGCPy (about 6.5%) and
PuGCPu (about 6.6%) sites, considering the short incubation times and limiting
enzyme concentrations. The latter two categories of sites, and presumably the
PyGCPu sites as well, are completely restricted under "relaxed" conditions,
provided an excess of enzyme is present and sufficient time is allowed (see Figure
7, and Xia et al., Nucleic Acids Res. 15:6075-6090 (1987)).
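
The four-way classification of GC sites by their flanking purine/pyrimidine structure can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used to build Table 5: it treats the input as a linear sequence (so sites at the very ends, or spanning the origin of a circular plasmid, are ignored), and the function name and toy sequence are assumptions made for the example.

```python
from collections import Counter

PURINES = set("AG")
PYRIMIDINES = set("CT")


def classify_gc_sites(seq: str) -> Counter:
    """Count GC dinucleotides by flanking class:
    PuGCPy, PyGCPy, PuGCPu or PyGCPu.

    The sequence is treated as linear; a GC without both flanking
    bases (or with a non-ACGT flank) is skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] != "GC":
            continue
        left, right = seq[i - 1], seq[i + 2]
        if left not in PURINES | PYRIMIDINES or right not in PURINES | PYRIMIDINES:
            continue
        left_class = "Pu" if left in PURINES else "Py"
        right_class = "Pu" if right in PURINES else "Py"
        counts[f"{left_class}GC{right_class}"] += 1
    return counts


# Toy sequence containing one site of each class (hypothetical, not pUC19).
example = "AGCTTGCAAGCATGCT"
print(classify_gc_sites(example))
```

Running the same function over a full plasmid sequence would give the per-class totals against which the cloned junctions can be compared.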

Digestion using CviJI** treatment results in a relatively even
distribution of breakage points across the length of the molecule (as shown in
Figure 11). As described above, Figure 11 depicts a linear map of pUC19
showing the relative position of the lacZ' gene (α peptide of the β-galactosidase
gene) and ampicillin resistance gene (Amp). The marks extending beneath the top
line (labeled "NORMAL") show the relative position of the 45 normal CviJI sites
(PuGCPy) present in pUC19. The marks above the line are the cleavage sites
found from sequencing the CviJI** partial library. The height of the line
indicates the number of clones obtained from cleavage at that site, and the
orientation of the flag designates the right or left orientation of the respective
clone. The marks extending beneath the second line (labeled "RELAXED") show
the relative positions of the 160 CviJI* sites (GC) present in pUC19. Those marks
above the line were found from sequencing the CviJI** partial library. The
bottom portion of Figure 11 shows the relative position and orientation of the first
20 clones sequenced, assuming a 350 bp read per clone. CviJI** cleavage at
relaxed sites appears to be important in "filling gaps" left by normal restriction.

The primary goal of this effort was to determine the efficacy of
these methods for rapid shotgun cloning and sequencing. For these purposes,
only 100 bases of sequence data were acquired per clone. However, if 350 bases
of sequence had been determined from each clone, then the entire sequence of
pUC19 would have been assembled from the overlap of the first 20 clones (Figure
11). In this sequencing simulation 75% of pUC19 would have been sequenced
at least 2 times from the first 20 clones. The highest degree of overfold
sequencing would have been 6, and only involved 2.2% of the DNA. Figure 11
also shows that most of the 1x sequencing coverage occurred in a region of the
plasmid with a very low density of normal and relaxed CviJI restriction sites.
Most of the single coverage occurs in a 240 bp region of the plasmid between
1490 bp and 1730 bp where there are only 4 CviJI relaxed sites. It should also
be noted that by the 27th randomly picked clone most of this region would have
been covered a second time.

Shotgun sequencing strategies are efficient for accumulating the
first 80-95% of the sequence data. However, the random nature of the method
means that the rate at which new sequence is accumulated decreases as more
clones are analyzed. In Figure 12 the total amount of unique pUC19 sequence
accumulated was plotted as a function of the number of clones sequenced. The
points represent the total amount of determined pUC19 sequence versus
the total number of clones sequenced. The horizontal dashed line demarcates the
2686 bp length of pUC19. The theoretical accumulation curve expected for a
process in which sequence information is acquired in a totally random fashion
is also shown. This smooth curve is a continuous plot of the discrete function
S(N), where

$$S(N) = N L e^{-c\sigma} \left[ \frac{e^{c\sigma} - 1}{c} + (1 - \sigma) \right].$$

This equation is based upon the results developed by Lander et al., Genomics
2:231-239 (1988) for the progress of contig generation in genetic mapping. In the
equation: N is the number of clones sequenced, L is the length of clone insert in
bp, c is the redundancy of coverage or LN/G (where G is length of fragment
being sequenced in bp), and σ = 1-θ, where θ is the fraction of length that two
clones must share. The curve in Figure 12 was calculated with G = 2686 bp, L
= 350 bp, and σ = 1. The plotted points lie close to the theoretical curve, and
it thus appears that the sequence of pUC19 was accumulated in an apparently
random fashion utilizing CviJI** fragmentation and column fractionation.
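
As a rough numerical illustration, the accumulation function can be evaluated with the parameters stated for Figure 12 (G = 2686 bp, L = 350 bp, σ = 1). The sketch below assumes the function reads S(N) = N·L·e^(-cσ)·[(e^(cσ) - 1)/c + (1 - σ)] with c = LN/G, i.e. the expected number of contigs multiplied by the expected contig length; the function name is hypothetical and the code is not taken from the original work.

```python
import math


def expected_unique_sequence(n_clones: int, insert_len: float,
                             target_len: float, sigma: float = 1.0) -> float:
    """Expected amount of unique sequence (bp) after n_clones random clones.

    Assumed form: S(N) = N * L * exp(-c*sigma) * ((exp(c*sigma) - 1)/c + (1 - sigma)),
    with redundancy c = L*N/G.
    """
    if n_clones <= 0:
        return 0.0
    c = insert_len * n_clones / target_len                 # redundancy of coverage
    n_contigs = n_clones * math.exp(-c * sigma)            # expected number of contigs
    contig_len = insert_len * ((math.exp(c * sigma) - 1.0) / c + (1.0 - sigma))
    return n_contigs * contig_len


G = 2686   # length of pUC19 in bp (from the text)
L = 350    # assumed read length per clone in bp (from the text)

for n in (5, 10, 20, 40, 76):
    s = expected_unique_sequence(n, L, G)
    print(f"N = {n:3d}  expected unique sequence = {s:7.1f} bp  ({s / G:.1%} of pUC19)")
```

With σ = 1 the expression reduces to G(1 - e^(-c)), which makes the diminishing return on each additional clone easy to see.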



Example 18 

Shotgun Cloning Utilizing 200 ng of Lambda DNA 



Generally, 2-5 µg of DNA are needed for the sonication and
agarose gel fractionation method of shotgun cloning in order to provide the
several hundred colonies or plaques required for sequence analysis (Bankier et al.,
Methods in Enzymol. 155:51-93 (1987)). A ten-fold reduction in the amount of
substrate required greatly simplifies the construction of such libraries, especially
from large genomes (Davidson, J. DNA Sequencing and Mapping 1:389-394
(1991)). The efficiency of constructing a large shotgun library from nanogram
amounts of substrate was tested utilizing 200 ng of CviJI**-digested lambda DNA.
This material was column-fractionated as described previously. In this case, 1/2
of the column eluant (15 µl containing 50 ng of DNA) was ligated to 100 ng of
HincII-digested and dephosphorylated pUC19 as described in Example 15. The
cloning efficiencies of the control DNAs were similar to those reported in Tables
2 and 3. The 50 ng cloning experiment yielded 230 white colonies per ligation
reaction in one trial, and 410 white colonies per ligation reaction in a second trial.
Thus, it should be possible to routinely construct useful quasi-random shotgun
libraries from as little as 0.2 - 0.5 µg of starting material.

Example 19

Epitope Mapping

CviJI* recognizes the sequence GC (except for PyGCPu) in the
target DNA. Under partial restriction conditions the length of the fragments may
be controlled by incubation time. Epitope mapping using CviJI** partial digests
involves generating DNA fragments of 100-300 bp from a cDNA coding for the
protein of interest, by methods described in Example 13, inserting them into an
M13 expression vector, plating out on solid media, lifting plaques onto a
membrane, screening for binding to the ligand of interest, and picking the positive
plaques for isolation of the DNA, which is then sequenced to identify the epitope.
Thus, the same epitope may be expressed as a small fragment or a larger
fragment. This approach allows one to determine the smallest fragment
containing the epitope of interest using functional assays such as binding to an
antibody or other ligand, or using a direct assay for activity. For insertion into
an M13 vector, linkers may be added to the fragments or the insert may be
dephosphorylated to ensure that each fragment is cloned alone without ligation of
multiple inserts.

The expression vectors recommended for subcloning of the CviJI
fragments are Lambda Zap™ (Stratagene, La Jolla, California) or bacteriophage
M13-epitope display vectors. An advantage of using an M13-based vector is that
the peptide or protein of interest may be displayed along with the M13 coat
protein and does not require host cell lysis in order to analyze the protein of
interest. The lambda-based vectors yield plaques and hence the protein can be
directly bound to a membrane filter.

Example 20

CGase I

CGase I, as used herein, refers to a restriction endonuclease reagent which
cleaves DNA at the dinucleotide CG. CGase I activity is based on the combined
star activities of the restriction endonucleases Hpa II and Taq I. Under normal
reaction conditions (10 mM Bis Tris Propane-HCl pH 7.0, 10 mM MgCl2, 1 mM
DTT; 1 unit of enzyme/µg DNA, 37°C for 1 hr), Hpa II recognizes CCGG and
cleaves after the first C to leave a 2-base 5' overhang. Under normal reaction
conditions (100 mM NaCl, 10 mM Tris-HCl pH 8.4, 10 mM MgCl2, 10 mM
2-mercaptoethanol; 1 unit of enzyme/µg DNA, 65°C for 1 hr) the restriction
endonuclease Taq I recognizes TCGA and cleaves after the T to leave a 2-base
5' overhang.

Reaction conditions have been described for Taq I* activity which decrease
the cleavage specificity of Taq I (10 mM Tris-HCl pH 9.0, 5 mM MgCl2, 6 mM
2-mercaptoethanol, 20% DMSO; 2000 units of enzyme/µg DNA, 65°C for 1 hr)
(Barany, Gene 65:149-165 (1988)). These reaction conditions allow Taq I* to
cleave DNA at the following sequences:

Taq I*     TCGA
           CCGA (TCGG)
           ACGA (TCGT)
           TCTA (TAGA)
           TCAA (TTGA)
           GCGA (TCGC)

We are unaware of any literature descriptions of Hpa II* conditions.
However, the following conditions were established to promote Hpa II* activity
which are also compatible with Taq I* activity: 5 mM KCl, 10 mM Tris-HCl pH
8.5, 10 mM MgCl2, 1 mM DTT, 15% DMSO, 100 µg/ml BSA (CGase buffer);
50 units of enzyme/µg DNA, 50°C for 1 hr. The Hpa II* recognition sites were
determined by cloning and sequencing Hpa II* restricted fragments. The
characterized Hpa II* recognition sequences are as follows:

Hpa II*    CCGG
           CCGC (GCGG)
           CCGA (TCGG)
           ACGG (CCGT)


Taq I (400 units/µg DNA) and Hpa II (50 units/µg DNA) were then
combined (CGase I) in CGase I buffer and the following recognition sites were
identified by cloning and sequencing restricted pUC19 fragments.

CGase I    GCGC
           TCGA
           CCGG
           GCGT
           ACGA
           ACGG (CCGT)
           GCGG (CCGC)
           CCGA (TCGG)


CGase I restriction of natural DNA (e.g., pUC19, lambda) results in fragments
ranging from 20-200 bp in length (average 20-60 bp). Heat denaturation of these
fragments generates numerous oligonucleotides of variable length but precise
specificity for the cognate template, as was the case with CviJ I* digestion. CGase
I restriction of the small plasmid pUC19 (2686 bp) theoretically yields 174
restriction fragments, or 384 oligonucleotides after a heat denaturation step.
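
To illustrate how the tetranucleotide contexts identified above could be located in a sequence, the following Python sketch scans one strand of a DNA string for the listed CGase I sites. It is an illustration under stated assumptions: the site set is the list from this Example plus the reverse complement of each entry (so that sites read from the opposite strand are also found), only site positions are reported and no cut position is modelled, and the function names and test sequence are hypothetical.

```python
# Tetranucleotide sites listed for CGase I in this Example (top-strand form).
LISTED_SITES = ["GCGC", "TCGA", "CCGG", "GCGT", "ACGA", "ACGG", "GCGG", "CCGA"]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Include reverse complements so that a site on either strand is detected
# when scanning the top strand only (an assumption of this sketch).
CGASE_SITES = set(LISTED_SITES) | {reverse_complement(s) for s in LISTED_SITES}


def find_cgase_sites(seq: str) -> list[int]:
    """Return 0-based start positions of every 4-base window matching a listed site."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] in CGASE_SITES]


# Hypothetical toy sequence, not taken from pUC19 or lambda.
toy = "AATCGATTGGCCGGAAACGTTT"
positions = find_cgase_sites(toy)
print(positions, [toy[i:i + 4] for i in positions])
```

For a linear molecule with n distinct cut sites the digest contains n + 1 fragments, and for a circular molecule n fragments; fragment sizes then follow from the differences between successive site positions.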

The "two-cutter" activity of CviJ I* and CGase I represents a unique class
of restriction endonuclease activity in that no other known restriction
endonucleases will generate this size range of oligonucleotides. The ability to
generate numerous oligonucleotides with perfect sequence specificity from any
DNA, without regard to sequence composition, genetic origin, or prior sequence
knowledge is one of the properties that CGase I shares with CviJ I*. In addition,
the generation of numerous oligonucleotides by CviJ I or CGase I results in a
form of probe or primer amplification not practical using conventional means of
organic synthesis.

Based on ability to recognize a dinucleotide sequence, the present invention 
contemplates the interchangeability of CGase I with CviJ I* in all of the 
applications described herein. 

Example 21 

Purification of CviJ I Restriction Endonuclease from 
IL-3A-Infected Chlorella Cells 

CviJ I was prepared by a modification of the method described by
Xia et al., Nucl. Acids Res. 15:6075-6090 (1987). Chlorella NC64A cells
(ATCC Accession No. 75399 deposited on January 21, 1993, American Type
Culture Collection, Rockville, Maryland) were infected with the virus IL-3A
(ATCC Accession No. 75354 deposited November 6, 1992, American Type
Culture Collection, Rockville, Maryland) according to Van Etten et al., Virology
126:117-125 (1983). Five grams of IL-3A infected Chlorella NC64A cells were
suspended in a glass homogenization flask with 15 g of 0.3 mm glass beads in
buffer A (10 mM Tris-HCl pH 7.9, 10 mM 2-mercaptoethanol, 50 µg/ml
phenylmethylsulfonyl fluoride (PMSF), 20 µg/ml benzamidine, 2 µg/ml
o-phenanthroline). Cell lysis was carried out at 4000 rpm for 90 sec in a Braun
MSK mechanical homogenizer (Allentown, PA) with cooling from a CO2 tank.
After lysis 2 M NaCl was added to a final concentration of 200 mM, after which
10% polyethyleneimine (PEI) (Life Technologies, Bethesda, MD) (pH 7.5) was
added to a final concentration of 0.3%. The mixture was then stirred for 2 hrs.
at 4°C then centrifuged for 1 hr. at 50,000 g. Ammonium sulfate was added to
the supernatant to 70% saturation and stirred overnight. A protein pellet was
recovered by centrifugation for 1 hr. at 50,000 g. The resulting pellet was
dissolved in 20 ml of buffer B (20 mM Tris-acetate pH 7.5, 0.5 mM EDTA, 10
mM 2-mercaptoethanol, 10% glycerol, 30 mM KCl, 50 µg/ml PMSF, 20 µg/ml
benzamidine [Sigma, St. Louis, Missouri], 2 µg/ml o-phenanthroline [Sigma]) and
dialysed against 500 ml of buffer B with 3 changes. The dialysed solution was
then applied to a 1 x 6 cm Heparin-Sepharose (Pharmacia LKB, Piscataway, New
Jersey) column. After a 50 ml wash with buffer B, a 100 ml gradient of 0 to 0.7
M KCl in buffer B was run. Fractions having CviJ I activity, as measured by
digestion of pUC19 DNA and agarose gel electrophoresis, were pooled, diluted
in 5 volumes of buffer C (10 mM K/PO4 pH 7.4, 0.5 mM EDTA, 10 mM
2-mercaptoethanol, 75 mM NaCl, 0.05% Triton X-100, 10% glycerol, 50 µg/ml
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline) and applied to a 1 x 7
cm Phosphocellulose P11 (Whatman) column equilibrated in buffer C. After
washing with 30 ml of buffer C, CviJ I was eluted by a 100 ml gradient of 0 to
0.7 M NaCl in buffer C. At this step CviJ I activity separated from non-specific
nucleases. CviJ I containing fractions were pooled and diluted in 4 volumes of
buffer C and applied to a 1 x 4 cm hydroxyapatite HTP column (BioRad,
Hercules, CA). After washing with 30 ml of buffer C, CviJ I was eluted by a 0
to 0.7 M potassium phosphate (pH 7.4) gradient in buffer C. Active fractions
containing CviJ I activity and lacking non-specific nuclease activity were pooled,
dialysed overnight against storage buffer (50 mM potassium phosphate,
200 mM KCl, 0.5 mM EDTA, 50% glycerol, 20 µg/ml PMSF) and
stored at -20°C.

Although the present invention has been described in terms of
preferred embodiments, it is intended that the present invention encompass all
modifications and variations which occur to those skilled in the art upon
consideration of the disclosure herein, and in particular those embodiments which
are within the broadest proper interpretation of the claims and their requirements.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Molecular Biology Resources, Inc.

(ii) TITLE OF INVENTION: Materials and Methods for
Restriction Endonuclease Applications

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun
    (B) STREET: 6300 Sears Tower, 233 South Wacker Drive
    (C) CITY: Chicago
    (D) STATE: Illinois
    (E) COUNTRY: United States of America
    (F) ZIP: 60606-6402

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Clough, David W.
    (B) REGISTRATION NUMBER: 36,107
    (C) REFERENCE/DOCKET NUMBER: 28003/31967/PCT

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: 312/474-6300
    (B) TELEFAX: 312/474-0448
    (C) TELEX: 25-3856



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 44 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CAATTTCACA CAGGAAACAG CTATGTCTTT TCGCACGTTA GAAC 44

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5496 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
ATGTCTTTTC GCACGTTAGA ACTATTCGCC GGTATAGCTG GTATTTCACA TGGCCTCAGA 60
GGTATATCTA CACCAGTTGC ATTCGTAGAA ATTAATGAAG ACGCACAAAA ATTCTTGAAA 120
ACAAAGTTTT CAGATGCATC TGTATTCAAT GACGTTACGA AATTTACCAA ATCGGACTTC 180
CCAGAAGACA TAGACATGAT TACTGCGGGA TTCCCGTGCA CTGGGTTTAG TATTGCAGGT 240
TCTAGAACTG GATTCGAACA CAAGGAATCC GGTCTCTTTG CTGATGTTGT GCGAATCACG 300
GAAGAGTATA AACCTAAAAT AGTGTTTTTG GAAAACTCCC ATATGTTGTC CCACACTTAC 360
AATCTCGATG TCGTCGTAAA AAAGATGGAT GAAATTGGTT ATTTCTGCAA GTGGGTAACT 420
TGTCGGGCAT CAATTATAGG AGCCCATCAT CAACGCCACC GGTGGTTTTG TCTCGCGATT 480
CGAAAAGATT ATGAACCAGA AGAAATAATT GTATCTGTGA ATGCTACAAA GTTCGACTGG 540
GAAAATAATG AACCACCGTG TCAAGTAGAC AATAAGAGTT ACGAGAATTC AACTCTTGTT 600
CGTCTGGCAG GATATTCCGT GGTCCCCGAC CAGATCAGAT ATGCTTTCAC CGGTCTATTT 660
ACAGGTGATT TTGAGTCATC GTGGAAAACT ACCTTGACAC CTGGGACAAT AATTGGCACG 720
GAACACAAAA AAATGAAAGG AACTTACGAT AAAGTCATAA ACGGGTATTA TGAGAACGAT 780
GTGTATTATT CTTTTTCAAG GAAAGAAGTT CATCGCGCTC CTCTAAATAT ATCCGTGAAA 840
CCACGTGATA TTCCGGAGAA ACATAACGGA AAAACACTCG TAGATCGCGA AATGATCAAG 900
AAATATTGGT GCACACCATG TGCTAGTTAT GGCACTGCTA CTGCTGGATG CAATGTTCTG 960
ACAGACCGTC AGTCACATGC ACTTCCTACA CAAGTCAGGT TTTCATATAG GGGTGTATGT 1020
GGACGACATT TGTCTGGTAT ATGGTGTGCA TGGTTGATGG GGTATGACCA AGAATATCTT 1080
GGTTATTTGG TTCAATATGA TTAAAATATT TTGATACACT AAATGGATAT AAGAAGAAAA 1140
CGTTTTACAA TAGAAGGGGC TAAACGTATA ATACTCGAAA AAAAGAGACT TGAAGAGAAA 1200
AAAAGAATTG CGGAAGAGAA AAAAAGAATT GCACTTATAG AAAAACAACG AATTGCGGAA 1260
GAGAAAAAAA GAATTGCGGA AGAGAAAAAA CGATTCGCAC TTGAAGAGAA AAAACGAATT 1320
GCGGAAGAAA AAAAACGAAT CGCGGAAGAG AAAAAACGAA TCGTGGAAGA GAAAAAAAGA 1380
CTTGCACTTA TAGAAAAACA ACGAATTGCG GAAGAGAAAA TTGCGTCGGG GAGAAAAATT 1440
AGAAAGAGGA TCTCTACAAA TGCAACAAAA CATGAAAGAG AATTTGTCAA AGTTATAAAT 1500
TCAATGTTCG TCGGACCCGC TACTTTTCTA TTCGTAGATA TAAAAGGTAA TAAATCCAGA 1560
GAAATCCACA ACGTTGTAAG ATTCAGACAA TTACAAGGCA GTAAAGCGAA ATCCCCGACC 1620
GCGTATGTTG ATAGAGAATA TAACAAACCT AAAGCGGATA TAGCAGCGGT AGACATAACC 1680
GGTAAAGATG TGGCATGGAT ATCCCATAAA GCATCTGAAG GATATCAACA ATATCTAAAA 1740
ATTTCTGGAA AGAACCTCAA GTTCACAGGA AAAGAATTAG AAGAAGTTCT ATCGTTCAAG 1800
AGAAAAGTAG TTAGTATGGC ACCGGTATCT AAAATATGGC CTGCTAATAA GACCGTATGG 1860
TCTCCTATCA AGTCAAATTT GATTAAAAAT CAAGCAATAT TCGGATTTGA TTACGGTAAG 1920
AAACCAGGAA GGGACAATGT AGACATCATA GGTCAAGGAC GACCAATTAT AACAAAAAGA 1980
GGTTCCATAT TATATCTTAC ATTCACTGGT TTTAGCGCAT TAAATGGGCA CTTGGAGAAT 2040
TTTACTGGGA AACATGAACC CGTTTTCTAT GTAAGAACAG AACGGAGTAG TAGCGGGAGA 2100
AGTATAACAA CTGTCGTCAA TGGTGTCACT TATAAAAATT TAAGATTCTT TATACATCCA 2160
TACAACTTTG TTTCTTCAAA AACACAACGT ATTATGTAGG ACCATTTTCC CGAGAGACTT 2220
TGTTGACCGC GTACTAAAAA ATGGTCACGA TATTTGTCTA AAGATGCTCA TAGAAGCAGG 2280
TGCAAACCTT GACATCGTCA GTGTTGAGTA TACACCATTA CATCTACATG TGGTGATATT 2340
TGTATAAACG GTAAATACCT ATATATACAA TACGTATCCC CCTAAAAGCG CTTAGATTTT 2400
TTAGTTGTAT ACTACTTTTG TATAAGACCT GTAAGTTACA AACTAAAAGT TTCAGCTTTG 2460
CCTTCGAAAC AAGCAATTAC CGCATGAGAA TAATATCCAT TATGGATGTT TTCTGCTAAT 2520
AAAACGATAT TTCCTACAGA AGTTTCTATG ATTAGTTCCG AAATATTGAG ATCATCGTCA 2580
CGTTTTTCTT TACCGTATTT TACTTTCGTG ATCGTCGCAC CAATAAAATC ATCTCGTGTG 2640
AGTTCATTCG GCAATTGTGC CGTGACACCA AATCTCTCAC AACAACCTTG ATGTCCATCC 2700
ATTGCTAACA CTATCGGTAA TOCATGTGTG GTGTGTACGA CCACACCGTT ATAACTATAA 2760
CACGTGTAGT TGTCGTCTAT ATCATATAAC TCGAGAGCGG TGTGAACTTC TTCAGATCTA 2820
TTATTAATCG GATCTGATCC ATAAGAAGAA TCTTCATATT TACAAATAAA ATCATCCGAT 2880
ATGTTCTGCA CACGAACAAC ATTCGTCAAA TTTCTGTGAT GACGAATCTC CATCTCTGAA 2940
TCATTAGAGA CTTGCGAGTA TATAACATTA TAATTGTTGA TATGATTATT ACGTTTCATA 3000
TCAACAAAAT ACATATAAAC ACCATACAAA TATTAAAACA CGTTAGTATA TAATGGATAA 3060
CATTTGCAAT AGTATATTCA CTGCAGTAAA AAATGGCCAC GAAGCTTGTT TGAAGATGAT 3120
GCTCATTGAA AGAGGTAGCA ATATCAATGA TGTTTCCGAA TCAAAATATG GAAATACACC 3180
ACTACATATT GCAGCTCATC ATGGTAATGA TGTGTGTTTG AAGATGCTTA TTGACGCAGG 3240
TGCAAACCTT GATATCACAG ATATTTCTGG AGGAACACCA CTTCATCGTG CGGTTTTGAA 3300
TGGCCATGAC ATATTGTACA GATGCTCGTA GAAGCAGGTG CAAACCTTAG TATCATAACT 3360
AATTTGGGAT GGATACCGTT ACATTACGCG GCTTTTAATG GTAATGATGC GATTTTGAGG 3420
ATGCTCATCG TTGTAAGTGA TAATGTTGAC GTTATCAATG ATCGCGGTTG GACGGCGTTA 3480
CATTACGCGG CTTTTAATGG TCATAGCATG TGCGTCAAGA CGCTTATTGA TGCGGGTGCA 3540
AATCTTGACA TCACAGATAT TTCGGGATGT ACACCACTTC ATCGTGCGGT TTATAATGAC 3600
CACGATGCAT GTGTGAAGAT ACTCGTAGAA GCAGGTGCAA CTCTTGACGT CATTGATGAT 3660
ACTGAGTGGG TGCCGTTACA TTACGCGGCT TTTAATGGTA ATGATGCGAT TTTGAGGATG 3720
CTCATTGAAG CAGGTGCAGA TATTGATATA TCTAATATAT GTGATTGGAC GGCGTTACAT 3780
TACGCGGCTC GAAATGGACA CGATGTGTGT ATAAAAACAC TCATCGAAGC AGGTGGTAAC 3840
ATCAACGCCG TCAACAAATC GGGGGATACA CCACTAGATA TTGCAGCATG TCATGACATT 3900
GCAGTATGTG TGATCGTGAT AGTCAATAAG ATCGTTTCGG AGCGGCCGTT GCGTCCGAGT 3960
GAGTTGTGTG TCATACCACC AACGTCTGCT GCATTAGGTG ATGTGTTGCG AACGACGATG 4020
CGGCTTCATG GGCGATCGGA AGCTGCAAAG ATCACAGCGC ATCTTCCTGT GGGTGCAAGG 4080
GATACTCTAC GAACTACTGC GTTGTGTTTG AACCGAACAA TTTCCGAGAG ATCTCGTTGA 4140
TAGTGTATTA ATTGAATGCG TGTAAAGTTA CGCTATTTTT TTCCAAAAAG GGTTTGCATG 4200
AAATACAACA CGATCTTTTG TAGATCGTTT ACCATTAGTT GTATTCGTGC AATAGAGACC 4260
ATACGTACCT CCAAATTCAT TTACTTTACC TACAGTATTA CCACTTCCTT TTTTTCCTAT 4320
AGTAGTATCT AAATTCAACC CTTTGAACTC ATCGCCATTA ACAGACAGAG CGTATGAACC 4380
GTTTTGTGCC AATTTCACCT TCAAAACGAT AGTAACCCAT TGACCTCTAG GAATTTTAAC 4440
CGATCTTATA AGTATCTGCT TACTTCCAAG TCCTTTTTCA AAAGCATACA ACGATCCTGT 4500
AAGGTTATCC CCAGAACCTG AAATTGTAAA GAACGACTGG AAATGAATAG GTTGCATTAG 4560
ATCTGTATAC ATATCACTTG GTTCGAAATG AAAATCGTAG TCCCAATTAG GTACGTTCCA 4620
CCAAGTTTAA TACGGGGTCT TTCCACCGAG ACCGGACATT TCAGCACGAG CCTTGTAAGA 4680
ATGATATGAT GTGGTTAAAT CTCTATCACC ATCGTTCCAC TTTCCTCTGA ACCGAAGACC 4740
ATGCATCGTT ATACCTGGTG CAACCTGTAC TAAATTCTTT ATTTCAGGTG CGGCTCCGGG 4800
TGGATTAACT CGAGATTCGT CAAATCTAAA ATATGATAAC GATGTTCCAA CAGTAGAACC 4860
ACTGGGTGGT ATGGCAGTTG CTGGAAGGGA AGGTAAAACT TTAGGATATT TCAAATCACC 4920
AACACCTTGA GGGTTTACTT GAATACTTCT GGGAGATGTT GGTGGTTTCG TCGAAGGTGG 4980
TTTCGTTGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 5040
TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 5100
TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 5160
TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTTGGC GGAAGTGGGG CATGACCATA 5220
ATCCGTTAAA TTCCCGCATT CACCTAATGA TGTACTCCAT AAAGAACCGG GTGCGCATTG 5280
CATTCTTATT GGTTCTGTAG TATCAGATAT ACATACGAAA TAATGAGAAT CATTTTCCCT 5340
GCCAAATAAT TTACCAGATT TGCCTTTACA TGACATTATT TGTAATATAA TATTATTATA 5400
ATTTTAAAAA AACTAACGTC TATTTAAAAT TATGTAATAC GTATTATATC AATGCATCAT 5460
CTTAATCATT TCCTAACGTA TAAGCGTAGC GAATTC 5496
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1225 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: join(1..33, 55..1128)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CAA GAA TAT CTT GGT TAT TTG GTT CAA TAT GAT TAAAATATTT TGATACACTA  53
Gln Glu Tyr Leu Gly Tyr Leu Val Gln Tyr Asp
1               5                   10

A ATG GAT ATA AGA AGA AAA CGT TTT ACA ATA GAA GGG GCT AAA CGT  99
  Met Asp Ile Arg Arg Lys Arg Phe Thr Ile Glu Gly Ala Lys Arg
              15                  20                  25

ATA ATA CTC GAA AAA AAG AGA CTT GAA GAG AAA AAA AGA ATT GCG GAA  147
Ile Ile Leu Glu Lys Lys Arg Leu Glu Glu Lys Lys Arg Ile Ala Glu
            30                  35                  40

GAG AAA AAA AGA ATT GCA CTT ATA GAA AAA CAA CGA ATT GCG GAA GAG  195
Glu Lys Lys Arg Ile Ala Leu Ile Glu Lys Gln Arg Ile Ala Glu Glu
        45                  50                  55

AAA AAA AGA ATT GCG GAA GAG AAA AAA CGA TTC GCA CTT GAA GAG AAA  243
Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg Phe Ala Leu Glu Glu Lys
    60                  65                  70

AAA CGA ATT GCG GAA GAA AAA AAA CGA ATC GCG GAA GAG AAA AAA CGA  291
Lys Arg Ile Ala Glu Glu Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg
75                  80                  85                  90

ATC GTG GAA GAG AAA AAA AGA CTT GCA CTT ATA GAA AAA CAA CGA ATT  339
Ile Val Glu Glu Lys Lys Arg Leu Ala Leu Ile Glu Lys Gln Arg Ile
                95                  100                 105

GCG GAA GAG AAA ATT GCG TCG GGG AGA AAA ATT AGA AAG AGG ATC TCT  387
Ala Glu Glu Lys Ile Ala Ser Gly Arg Lys Ile Arg Lys Arg Ile Ser
            110                 115                 120

ACA AAT GCA ACA AAA CAT GAA AGA GAA TTT GTC AAA GTT ATA AAT TCA  435
Thr Asn Ala Thr Lys His Glu Arg Glu Phe Val Lys Val Ile Asn Ser
        125                 130                 135

ATG TTC GTC GGA CCC GCT ACT TTT GTA TTC GTA GAT ATA AAA GGT AAT  483
Met Phe Val Gly Pro Ala Thr Phe Val Phe Val Asp Ile Lys Gly Asn
    140                 145                 150

AAA TCC AGA GAA ATC CAC AAC GTT GTA AGA TTC AGA CAA TTA CAA GGC  531
Lys Ser Arg Glu Ile His Asn Val Val Arg Phe Arg Gln Leu Gln Gly
155                 160                 165                 170

AGT AAA GCG AAA TCC CCG ACC GCG TAT GTT GAT AGA GAA TAT AAC AAA  579
Ser Lys Ala Lys Ser Pro Thr Ala Tyr Val Asp Arg Glu Tyr Asn Lys
                175                 180                 185

CCT AAA GCG GAT ATA GCA GCG GTA GAC ATA ACC GGT AAA GAT GTG GCA  627
Pro Lys Ala Asp Ile Ala Ala Val Asp Ile Thr Gly Lys Asp Val Ala
            190                 195                 200

TGG ATA TCC CAT AAA GCA TCT GAA GGA TAT CAA CAA TAT CTA AAA ATT  675
Trp Ile Ser His Lys Ala Ser Glu Gly Tyr Gln Gln Tyr Leu Lys Ile
        205                 210                 215

TCT GGA AAG AAC CTC AAG TTC ACA GGA AAA GAA TTA GAA GAA GTT CTA  723
Ser Gly Lys Asn Leu Lys Phe Thr Gly Lys Glu Leu Glu Glu Val Leu
    220                 225                 230

TCG TTC AAG AGA AAA GTA GTT AGT ATG GCA CCG GTA TCT AAA ATA TGG  771
Ser Phe Lys Arg Lys Val Val Ser Met Ala Pro Val Ser Lys Ile Trp
235                 240                 245                 250

CCT GCT AAT AAG ACC GTA TGG TCT CCT ATC AAG TCA AAT TTG ATT AAA  819
Pro Ala Asn Lys Thr Val Trp Ser Pro Ile Lys Ser Asn Leu Ile Lys
                255                 260                 265

AAT CAA GCA ATA TTC GGA TTT GAT TAC GGT AAG AAA CCA GGA AGG GAC  867
Asn Gln Ala Ile Phe Gly Phe Asp Tyr Gly Lys Lys Pro Gly Arg Asp
            270                 275                 280

AAT GTA GAC ATC ATA GGT CAA GGA CGA CCA ATT ATA ACA AAA AGA GGT  915
Asn Val Asp Ile Ile Gly Gln Gly Arg Pro Ile Ile Thr Lys Arg Gly
        285                 290                 295

TCC ATA TTA TAT CTT ACA TTC ACT GGT TTT AGC GCA TTA AAT GGG CAC  963
Ser Ile Leu Tyr Leu Thr Phe Thr Gly Phe Ser Ala Leu Asn Gly His
    300                 305                 310

TTG GAG AAT TTT ACT GGG AAA CAT GAA CCC GTT TTC TAT GTA AGA ACA  1011
Leu Glu Asn Phe Thr Gly Lys His Glu Pro Val Phe Tyr Val Arg Thr
315                 320                 325                 330

GAA CGG AGT AGT AGC GGG AGA AGT ATA ACA ACT GTC GTC AAT GGT GTC  1059
Glu Arg Ser Ser Ser Gly Arg Ser Ile Thr Thr Val Val Asn Gly Val
                335                 340                 345

ACT TAT AAA AAT TTA AGA TTC TTT ATA CAT CCA TAC AAC TTT GTT TCT  1107
Thr Tyr Lys Asn Leu Arg Phe Phe Ile His Pro Tyr Asn Phe Val Ser
            350                 355                 360

TCA AAA ACA CAA CGT ATT ATG TAGGACCATT TTCCCGAGAG ACTTTGTTGA  1158
Ser Lys Thr Gln Arg Ile Met
        365

CCGCGTACTA AAAAATGGTC ACGATATTTG TCTAAAGATG CTCATAGAAG CAGGTGCAAA  1218
CCTTGAC  1225

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 369 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Gln Glu Tyr Leu Gly Tyr Leu Val Gln Tyr Asp Met Asp Ile Arg Arg
1               5                   10                  15

Lys Arg Phe Thr Ile Glu Gly Ala Lys Arg Ile Ile Leu Glu Lys Lys
            20                  25                  30

Arg Leu Glu Glu Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg Ile Ala
        35                  40                  45

Leu Ile Glu Lys Gln Arg Ile Ala Glu Glu Lys Lys Arg Ile Ala Glu
    50                  55                  60

Glu Lys Lys Arg Phe Ala Leu Glu Glu Lys Lys Arg Ile Ala Glu Glu
65                  70                  75                  80

Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg Ile Val Glu Glu Lys Lys
                85                  90                  95

Arg Leu Ala Leu Ile Glu Lys Gln Arg Ile Ala Glu Glu Lys Ile Ala
            100                 105                 110

Ser Gly Arg Lys Ile Arg Lys Arg Ile Ser Thr Asn Ala Thr Lys His
        115                 120                 125

Glu Arg Glu Phe Val Lys Val Ile Asn Ser Met Phe Val Gly Pro Ala
    130                 135                 140

Thr Phe Val Phe Val Asp Ile Lys Gly Asn Lys Ser Arg Glu Ile His
145                 150                 155                 160

Asn Val Val Arg Phe Arg Gln Leu Gln Gly Ser Lys Ala Lys Ser Pro
                165                 170                 175

Thr Ala Tyr Val Asp Arg Glu Tyr Asn Lys Pro Lys Ala Asp Ile Ala
            180                 185                 190

Ala Val Asp Ile Thr Gly Lys Asp Val Ala Trp Ile Ser His Lys Ala
        195                 200                 205

Ser Glu Gly Tyr Gln Gln Tyr Leu Lys Ile Ser Gly Lys Asn Leu Lys
    210                 215                 220

Phe Thr Gly Lys Glu Leu Glu Glu Val Leu Ser Phe Lys Arg Lys Val
225                 230                 235                 240

Val Ser Met Ala Pro Val Ser Lys Ile Trp Pro Ala Asn Lys Thr Val
                245                 250                 255

Trp Ser Pro Ile Lys Ser Asn Leu Ile Lys Asn Gln Ala Ile Phe Gly
            260                 265                 270

Phe Asp Tyr Gly Lys Lys Pro Gly Arg Asp Asn Val Asp Ile Ile Gly
        275                 280                 285

Gln Gly Arg Pro Ile Ile Thr Lys Arg Gly Ser Ile Leu Tyr Leu Thr
    290                 295                 300

Phe Thr Gly Phe Ser Ala Leu Asn Gly His Leu Glu Asn Phe Thr Gly
305                 310                 315                 320

Lys His Glu Pro Val Phe Tyr Val Arg Thr Glu Arg Ser Ser Ser Gly
                325                 330                 335

Arg Ser Ile Thr Thr Val Val Asn Gly Val Thr Tyr Lys Asn Leu Arg
            340                 345                 350

Phe Phe Ile His Pro Tyr Asn Phe Val Ser Ser Lys Thr Gln Arg Ile
        355                 360                 365

Met



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GTAAAACGAC GGCCAGT 17

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GCCAAGCTTG GATGAT 16
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(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
ATCTTCGCGA ATTCACTGGC CGTCGTTTTA C 31 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 
GAATTCGCGA AGAT 14

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ATCATCCAAG CTTGGCACTG GCCGTCGTTT TAC 33 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GTAAAACGAC GGCCAGTGAA TTCGCGAAGA TNNNNNNNNN NNNNNNNNAT CATCCAAGCT 60 
TGGCACTGGC CGTCGTTTTA C 81
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(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTAAAACGAC GGCCAGTGCC AAGCTTGGAT GATNNNNNNN NNNNNNNNNN ATCTTCGCGA 60
ATTCACTGGC CGTCGTTTTA C 81
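
By way of illustration only, the relationship among SEQ ID NOS: 5 through 11 can be checked mechanically. The following Python sketch transcribes the seven sequences from the listing above and tests whether SEQ ID NOS: 10 and 11 are each a concatenation of the shorter sequences around a run of 17 unspecified bases, and whether SEQ ID NOS: 7 and 9 are reverse complements of the corresponding 5' portions. The helper names `joined` and `reverse_complement` are hypothetical and are not part of the disclosure.

```python
def joined(listing_text):
    """Remove the spaces used for ten-base grouping in the listing."""
    return "".join(listing_text.split())


def reverse_complement(seq):
    """Reverse complement of a DNA string; N is left as N."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


# Sequences transcribed from the listing (SEQ ID NOS: 5 through 11).
SEQ5 = joined("GTAAAACGAC GGCCAGT")
SEQ6 = joined("GCCAAGCTTG GATGAT")
SEQ7 = joined("ATCTTCGCGA ATTCACTGGC CGTCGTTTTA C")
SEQ8 = joined("GAATTCGCGA AGAT")
SEQ9 = joined("ATCATCCAAG CTTGGCACTG GCCGTCGTTT TAC")
SEQ10 = joined(
    "GTAAAACGAC GGCCAGTGAA TTCGCGAAGA TNNNNNNNNN NNNNNNNNAT CATCCAAGCT "
    "TGGCACTGGC CGTCGTTTTA C"
)
SEQ11 = joined(
    "GTAAAACGAC GGCCAGTGCC AAGCTTGGAT GATNNNNNNN NNNNNNNNNN ATCTTCGCGA "
    "ATTCACTGGC CGTCGTTTTA C"
)

SPACER = "N" * 17  # run of unspecified bases in SEQ ID NOS: 10 and 11

checks = {
    "lengths match the LENGTH fields": [len(s) for s in (SEQ5, SEQ6, SEQ7, SEQ8, SEQ9, SEQ10, SEQ11)]
    == [17, 16, 31, 14, 33, 81, 81],
    "SEQ10 == SEQ5 + SEQ8 + N17 + SEQ9": SEQ10 == SEQ5 + SEQ8 + SPACER + SEQ9,
    "SEQ11 == SEQ5 + SEQ6 + N17 + SEQ7": SEQ11 == SEQ5 + SEQ6 + SPACER + SEQ7,
    "SEQ7 == revcomp(SEQ5 + SEQ8)": SEQ7 == reverse_complement(SEQ5 + SEQ8),
    "SEQ9 == revcomp(SEQ5 + SEQ6)": SEQ9 == reverse_complement(SEQ5 + SEQ6),
}

for label, ok in checks.items():
    print(f"{label}: {ok}")
```

With the sequences transcribed as shown, each check evaluates to True. The check concerns only the listing text and makes no statement about how the oligonucleotides were prepared.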
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(26..148, 190..207, 244..270)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TAACAATTTC ACACAGGAAA CAGCT ATG ACC ATG ATT ACG CCA AGC TCG AAA 52
Met Thr Met Ile Thr Pro Ser Ser Lys
1 5

TTA ACC CTC ACT AAA GGG AAC AAA AGC TGG TAC CGG GGC CCC CCC TCG 100
Leu Thr Leu Thr Lys Gly Asn Lys Ser Trp Tyr Arg Gly Pro Pro Ser
10 15 20 25

AGG TCG ACG GTA TCG ATA AGC TTG ATA AAC CAT TTA TAC AAT AAG CGT 148
Arg Ser Thr Val Ser Ile Ser Leu Ile Asn His Leu Tyr Asn Lys Arg
30 35 40

TGATATAAGT TTGTATATAC GTCATTTCGT TATATCAACA A ATG TTA TCA TAT 201
Met Leu Ser Tyr
45

TAT ACG TAAAACTGGC TTAAAAAAAA ACGAGGTGTA ACTATA ATG TCT TTT CGC 255
Tyr Thr Met Ser Phe Arg
50

ACG TTA GAA CTA TTT 270
Thr Leu Glu Leu Phe
55
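
For illustration, the correspondence between the CDS feature of SEQ ID NO: 12 and the 56 amino acid protein sequence listed for it can be reproduced with a short Python sketch. The sketch assumes the standard genetic code and simply concatenates the three segments named in the `join(...)` location. The names `CODON_TABLE`, `CDS_JOIN` and `translate_join` are hypothetical helpers introduced here, not part of the disclosure.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO: 12 transcribed from the listing; whitespace is removed by split/join.
SEQ12 = "".join("""
TAACAATTTC ACACAGGAAA CAGCT ATG ACC ATG ATT ACG CCA AGC TCG AAA
TTA ACC CTC ACT AAA GGG AAC AAA AGC TGG TAC CGG GGC CCC CCC TCG
AGG TCG ACG GTA TCG ATA AGC TTG ATA AAC CAT TTA TAC AAT AAG CGT
TGATATAAGT TTGTATATAC GTCATTTCGT TATATCAACA A ATG TTA TCA TAT
TAT ACG TAAAACTGGC TTAAAAAAAA ACGAGGTGTA ACTATA ATG TCT TTT CGC
ACG TTA GAA CTA TTT
""".split())

# 1-based inclusive coordinates from the feature line: join(26..148, 190..207, 244..270).
CDS_JOIN = [(26, 148), (190, 207), (244, 270)]


def translate_join(seq, segments):
    """Concatenate the listed segments and translate them codon by codon."""
    coding = "".join(seq[start - 1:end] for start, end in segments)
    usable = len(coding) - len(coding) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


protein = translate_join(SEQ12, CDS_JOIN)
print(len(SEQ12), len(protein))
print(protein)
```

Under these assumptions the joined segments are 168 bases long and translate to 56 residues, in agreement with the LENGTH field of the protein sequence that follows.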
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(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 56 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



Met Thr Met Ile Thr Pro Ser Ser Lys Leu Thr Leu Thr Lys Gly Asn
1 5 10 15

Lys Ser Trp Tyr Arg Gly Pro Pro Ser Arg Ser Thr Val Ser Ile Ser
20 25 30

Leu Ile Asn His Leu Tyr Asn Lys Arg Met Leu Ser Tyr Tyr Thr Met
35 40 45

Ser Phe Arg Thr Leu Glu Leu Phe
50 55
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page __, line 13

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

November 6, 1992

Accession Number

A.T.C.C. 75354

C. ADDITIONAL INDICATIONS (leave blank if not applicable)
This information is continued on an additional sheet [ ]

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 79, line 10

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [x]

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

January 21, 1993

Accession Number

A.T.C.C. 75399

C. ADDITIONAL INDICATIONS (leave blank if not applicable)
This information is continued on an additional sheet [ ]

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 31, line 25

B. IDENTIFICATION OF DEPOSIT
Further deposits are identified on an additional sheet [ ]

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

June 30, 1994

Accession Number

A.T.C.C. 69341

C. ADDITIONAL INDICATIONS (leave blank if not applicable)
This information is continued on an additional sheet [ ]

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
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WE CLAIM: 

1. A purified and isolated polynucleotide encoding a CviJ I
polypeptide or a variant thereof possessing activity characteristic of CviJ I, said
polynucleotide comprising a polynucleotide as set out in SEQ ID NO: 2. 

2. The polynucleotide of claim 1 which is a DNA. 

3. The DNA of claim 2 which is a viral genomic DNA 
sequence or a biological replica thereof. 

4. The DNA of claim 2 which is a wholly or partially 
chemically synthesized DNA or biological replica thereof. 

5. A purified isolated DNA encoding a polypeptide according 
to claim 1 by means of degenerate codons. 

6. A vector comprising a DNA according to claim 2. 

7. The vector of claim 6 which is the plasmid pCJHl A (ATCC 
Accession No. 69341). 

8. A host cell stably transformed or transfected with a DNA 
according to claim 2 in a manner allowing the expression in said host cell of a 
CviJ I polypeptide or a variant thereof possessing a sequence specificity
characteristic of CviJ I.

9. The host cell according to claim 8, wherein said host cell 

is E. coli.

10. A method for producing a CviJ I polypeptide or a variant
thereof possessing biological activity specific to CviJ I, said method comprising the
steps of: 

a) growing a transformed host cell containing a vector 
according to claim 6 in a suitable nutrient medium; and 

b) isolating the CviJ I polypeptide or variant thereof from

said host cell. 

11. The method of claim 10 wherein said host cell is E. coli.

12. A recombinant CviJ I polypeptide.

13. A polypeptide produced by the method of claim 10. 

14. A method for restriction endonuclease digestion of DNA 
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is cleaved at a dinucleotide sequence selected 
from the group consisting of PyGCPy, PuGCPy, PuGCPu, and wherein Pu = 
purine and Py = pyrimidine. 
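
By way of illustration only, the three sequence classes recited in claim 14 can be located in a DNA sequence with a short Python sketch. The sketch uses the definitions given in the claim (Pu = A or G, Py = C or T) and reports the 0-based start position of every four-base window of the form N-G-C-N whose flanking bases fall into one of the recited classes. The function name `find_gc_sites` and the constant names are hypothetical. The sketch only finds matching windows and makes no assumption about where within the window cleavage occurs.

```python
PU = "AG"  # purines, per the definition in claim 14
PY = "CT"  # pyrimidines, per the definition in claim 14

# (5' flank class, 3' flank class) for PyGCPy, PuGCPy and PuGCPu.
CLAIM_14_CONTEXTS = [(PY, PY), (PU, PY), (PU, PU)]


def find_gc_sites(seq, contexts=CLAIM_14_CONTEXTS):
    """Return 0-based start positions of N-G-C-N windows matching a context."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 3):
        five, g, c, three = seq[i], seq[i + 1], seq[i + 2], seq[i + 3]
        if g == "G" and c == "C" and any(
            five in left and three in right for left, right in contexts
        ):
            hits.append(i)
    return hits


# Example: AGCT is a PuGCPy window, TGCA is PyGCPu and is not reported.
print(find_gc_sites("AAGCTTGCA"))
```

Passing `contexts=[(PU, PY)]` restricts the scan to the PuGCPy class alone.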



15. A method for restriction endonuclease digestion of DNA 
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is digested at 11 of 16 possible dinucleotide 
sequences and wherein said dinucleotide sequences are selected from the group 
consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu, and wherein Pu = 
purine and Py = pyrimidine. 



16. The method according to claim 14 wherein said restriction 
endonuclease reagent comprises CviJ I.
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17. A restriction endonuclease reagent, said restriction 
endonuclease reagent comprising in combination, Taq I and Hpa II (CGase I),
said reagent capable of digesting DNA at 11 of 16 possible dinucleotide 
sequences, said sequences selected from the group consisting of PuCGPu, 
PuCGPy, PyCGPy and PyCGPu, and wherein Pu = purine and Py = pyrimidine. 

18. The method according to claim 15 wherein said restriction 
endonuclease reagent is selected from the group consisting of Aci I and CGase I. 



19. The method according to claim 16 wherein said digestion 
of DNA is a partial digestion and wherein said digestion generates quasi-random 
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose
gel. 

20. The method according to claim 18 wherein said digestion of 
DNA is a partial digestion and wherein said digestion generates quasi-random 
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose 
gel. 



21. The method according to claims 16 or 18 wherein said 
digestion is complete, and wherein said digestion generates DNA fragments from 
about 20 base pairs in length to about 200 base pairs in length and wherein said 
fragments have an average length of about 20 to about 60 nucleotides. 
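
As a rough plausibility check on the average fragment lengths recited above, the expected spacing between cleavage sites can be estimated under a simplifying assumption that is not part of the claims: a long random DNA in which the four bases occur independently and with equal frequency. Each class of the form (Pu or Py)-G-C-(Pu or Py) then occurs at a given position with probability 1/2 x 1/4 x 1/4 x 1/2 = 1/64, and the classes are mutually exclusive, so their probabilities add. The function name `expected_spacing` is hypothetical.

```python
from fractions import Fraction


def expected_spacing(n_classes):
    """Mean distance in bases between sites when n_classes site classes are cut.

    Assumes independent, equiprobable bases. Each class fixes G and C exactly
    (1/4 each) and restricts each flank to purine or pyrimidine (1/2 each).
    """
    p_one_class = Fraction(1, 2) * Fraction(1, 4) * Fraction(1, 4) * Fraction(1, 2)
    return 1 / (n_classes * p_one_class)


for n in (1, 3):
    spacing = expected_spacing(n)
    print(f"{n} class(es): one site per {float(spacing):.1f} bases on average")
```

Under this assumption one class gives a mean spacing of 64 bases and three classes give 64/3, or about 21 bases, which is of the same order as the range of about 20 to about 60 nucleotides recited in claim 21. Real DNA departs from the equal-frequency assumption, so the figures are indicative only.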

22. The method according to claims 19 or 20 wherein said quasi- 
random fragments are from about 100 base pairs to about 10,000 base pairs in
length. 
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23. A method for shotgun cloning and sequencing DNA, 
comprising the steps of: 

a) partially digesting DNA according to claims 19 or 20; 

b) ligating said partially digested DNA into a linearized 
cloning vector thereby creating a recombinant vector; 

c) introducing said recombinant vector into a host cell; 

d) selecting said host cell for the presence of said recombinant 
vector; 

e) growing and amplifying said host cell containing said 
recombinant vector; 

f) isolating and purifying said recombinant vector from said 
grown and amplified host cells; and 

g) sequencing said DNA contained in said recombinant vector. 

24. The method according to claim 23 wherein said restriction 
endonuclease reagent comprises CviJ I. 

25. The method according to claim 23 wherein said restriction 
endonuclease reagent comprises CGase I. 

26. The method according to claim 23 wherein said quasi-random 
fragments are from about 100 base pairs to about 10,000 base pairs in length. 

27. The method according to claim 23 wherein said quasi-random 
fragments are from about 500 bp to about 2,000 bp in length. 



28. The method according to claim 23 wherein said cloning vector 
is selected from the group consisting of plasmids, phage, and cosmids. 
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29. The method according to claim 28 wherein said plasmid is 

pUC19.

30. The method according to claim 28 wherein said bacteriophage
is λ.

31. The method according to claim 28 wherein said bacteriophage
is M13.

32. The method according to claim 23 wherein said host cell is a
bacterium.

33. The method according to claim 32 wherein said host cell is E.
coli.

34. The method according to claim 23 wherein said sequencing is
dideoxy sequencing.

35. A kit for the shotgun cloning of DNA, said kit comprising in
association:

a) a restriction endonuclease reagent, according to
claims 16 or 18;

b) a restriction endonuclease buffer;

c) ligation buffer; and

d) T4 DNA ligase.
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36. The kit of claim 35 further comprising in association: 

e) competent host bacteria; 

f) chromatography matrix said matrix useful for the size 
selection of restriction endonuclease digested DNA; 

g) spin filters, said spin filters useful for the size selection of 
restriction endonuclease digested DNA; 

h) a cloning vector; 

i) positive control DNA useful in the monitoring of the 
efficiency of the said shotgun cloning; and 

j) molecular size marker DNA. 

37. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CviJ I.

38. The kit according to claim 37 wherein said restriction 
endonuclease buffer is CviJ I** buffer.

39. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CGase I. 

40. The kit according to claim 39 wherein said restriction 
endonuclease buffer is CGase I buffer. 

41. The kit according to claim 36 wherein said competent host
bacteria is competent E. coli DH5αF'.

42. The kit according to claim 36 wherein said chromatography 
matrix is Sephacryl-S500. 
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43. The kit according to claim 36 wherein said cloning vector is 

M13 mp18.

44. A method for labeling DNA, the method comprising the steps 

of: 

a) digesting an aliquot of template DNA with a restriction 
endonuclease reagent according to claim 21 and wherein said 
digestion generates sequence-specific DNA fragments; 

b) mixing an aliquot of undigested template DNA with said 
sequence-specific DNA fragments, denaturing said mixture of 
template DNA and sequence-specific DNA fragments thereby 
generating denatured template DNA and oligonucleotide primers;

c) annealing said primers to said denatured undigested template 
DNA to form a DNA-primer complex; 

d) performing an extension reaction from said primers in said 
DNA-primer complex using a DNA polymerase in the presence of 
one or more nucleotide triphosphates and wherein at least one 
nucleotide triphosphate has a label. 

45. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CviJ I.

46. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CGase I.



47. The method according to claim 44 wherein said extension 
reaction is performed by a DNA polymerase. 
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48. The method according to claim 47 wherein said DNA 
polymerase is Thermus flavus DNA polymerase.

49. The method according to claim 44 wherein the one or more 
nucleotide triphosphates are selected from the group consisting of dATP, dCTP, 
dGTP, dUTP and dTTP. 

50. The method according to claim 44 wherein said labeled 
nucleotide triphosphate is selected from the group consisting of 32P-labeled
nucleotide triphosphates and 33P-labeled nucleotide triphosphates.

51. The method according to claim 44 wherein said labeled 
nucleotide triphosphate is selected from the group consisting of biotin-labeled 
nucleotide triphosphates, fluorescein-labeled nucleotide triphosphates,
dinitrophenol-labeled nucleotide triphosphates, and digoxigenin-labeled nucleotide 
triphosphates. 
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52. A method for thermal cycle labeling DNA comprising the 

steps of: 

a) digesting an aliquot of template DNA with a restriction 
endonuclease reagent according to claim 21 and wherein said 
digestion generates sequence-specific DNA fragments; 

b) mixing an aliquot of undigested template DNA with said 
sequence-specific DNA fragments, denaturing said mixture of 
template DNA and said DNA fragments thereby generating 
denatured template DNA and oligonucleotide primers; 

c) annealing said primers to said denatured undigested template 
DNA to form a DNA-primer complex; 

d) performing an extension reaction from said primers in said 
DNA-primer complex using a DNA polymerase in the presence of 
one or more nucleotide triphosphates and wherein at least one 
nucleotide triphosphate has a label;

e) heat-denaturing said labeled extension products; 

f) reannealing said excess primers with said template DNA 
and with said extension products; 

g) performing at least one additional extension reaction from 
said DNA-primer complex using a DNA polymerase. 

53. The method according to claim 52 wherein said restriction 
endonuclease reagent comprises CviJ I. 



54. The method according to claim 52 wherein said restriction 
endonuclease reagent comprises CGase I.



55. The method according to claim 52 wherein said DNA 
polymerase is a heat stable DNA polymerase. 






56. The method according to claim 55 wherein said heat-stable 
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment 
thereof. 

57. The method according to claim 52 wherein said extension 
products also serve as templates. 

58. The method according to claim 52 wherein said label is 
selected from the group consisting of fluorescein, dinitrophenol, biotin, and 
digoxigenin. 

59. The method according to claim 52 wherein said label is 
selected from the group consisting of 32P, 33P, 3H, 14C, and 35S.

60. The method according to claim 52 wherein steps e)-g) are 
repeated up to 20 times. 

61. A kit for labeling DNA, said kit comprising in association: 

a) a restriction endonuclease reagent, according to 
claims 16 or 18; 

b) a restriction endonuclease buffer; and 

c) a labeling buffer. 

62. The kit according to claim 61 wherein said restriction 
endonuclease reagent comprises CviJ I. 



63. The kit according to claim 62 wherein said restriction 
endonuclease buffer is CviJ I* restriction endonuclease buffer. 
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64. The kit according to claim 61 wherein said restriction 
endonuclease reagent is selected from the group consisting of CGase I and Aci I. 

65. The kit according to claim 64 wherein said restriction 
endonuclease buffer is CGase I buffer. 

66. The kit of claim 64 further comprising: 

d) a concentrated mixture of 1 or more nucleotide
triphosphates; 

e) a DNA polymerase; 

f) control DNA, said control DNA being useful for monitoring 
the efficiency of labeling. 

67. The kit according to claim 66 wherein said nucleotide mixture
is an equimolar mixture of one or more nucleotides selected from the group 
consisting of dCTP, dTTP, dATP, and dGTP. 



68. The kit according to claim 66 additionally comprising a labeled 
nucleotide selected from the group consisting of biotin-11-dUTP, digoxigenin-11-
dUTP and fluorescein-11-dUTP.

69. The kit according to claim 66 additionally comprising a labeled 
nucleotide selected from the group consisting of 32P-labeled nucleotides, 33P-
labeled nucleotides, 14C-labeled nucleotides, 35S-labeled nucleotides, and 3H-
labeled nucleotides.

70. The kit according to claim 66 wherein said DNA polymerase 
is the Klenow fragment of DNA polymerase I.
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71. The kit according to claim 66 wherein said DNA polymerase 
is a thermostable DNA polymerase. 

72. The kit according to claim 66 wherein said thermostable DNA
polymerase is Thermus flavus DNA polymerase. 

73. A method for universal thermal cycle labelling DNA 
comprising the steps of: 

a) mixing an aliquot of template DNA with a holo- 
enzyme of a thermostable DNA polymerase, whereby the 
polymerase provides endogenously purified DNA primers; 

b) denaturing said mixture of template DNA and said 
endogenous DNA primers; 

c) annealing said mixture of denatured template DNA 
and said endogenous DNA primers to form a DNA-primer 
complex; 

d) performing an extension reaction from said 
endogenous DNA primers in said DNA-primer complex 
using said DNA polymerase in the presence of one or more 
nucleotide triphosphates and wherein at least one nucleotide 
triphosphate has a label; 

e) heat-denaturing said labeled extension products; 

f) reannealing said endogenous primers with said 
template DNA and with said extension products; 

g) performing at least one additional extension reaction 
from said DNA-primer complex using a DNA polymerase. 
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74. The method according to Claim 73 wherein said heat-stable 
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment 
thereof. 

75. The method according to claim 73 wherein said extension 
products also serve as templates. 

76. The method according to claim 73 wherein said label is 
selected from the group consisting of fluorescein, dinitrophenol, biotin, and 
digoxigenin. 

77. The method according to claim 73 wherein said label is 
selected from the group consisting of 32P, 33P, 3H, 14C, and 35S.

78. The method according to claim 73 wherein steps e)-g) are 
repeated up to 20 times. 

79. A kit for labeling DNA, said kit comprising in association: 

a) a holo-enzyme of a thermostable DNA polymerase; 
and 

b) a DNA polymerase buffer. 

80. The kit of claim 79 further comprising: 

c) a concentrated mixture of 1 or more nucleotide 
triphosphates; 

d) control DNA, said control DNA being useful for monitoring 
the efficiency of labeling. 
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81. The kit according to claim 80 wherein said nucleotide mixture 
is an equimolar mixture of one or more nucleotides selected from the group 
consisting of dCTP, dTTP, dATP, and dGTP. 

82 . The kit according to claim 80 additionally comprising a labeled 
nucleotide selected from the group consisting of biotin-11-dUTP, digoxigenin-11-
dUTP and fluorescein-11-dUTP.

83. The kit according to claim 80 additionally comprising a labeled 
nucleotide selected from the group consisting of 32P-labeled nucleotides, 33P-
labeled nucleotides, 14C-labeled nucleotides, 35S-labeled nucleotides, and 3H-
labeled nucleotides.

84. The kit according to claim 80 wherein said thermostable DNA 
polymerase is Thermus aquaticus DNA polymerase.

85. The kit according to claim 80 wherein said thermostable DNA 
polymerase is Thermus flavus DNA polymerase.

86. A method for labeling of restriction-generated oligonucleotides, 
the method comprising the steps of:

a) digesting an aliquot of template DNA according to 
claim 21; 

b) heat denaturing said digested DNA thereby generating 
sequence-specific oligonucleotides; and 

c) labeling said sequence-specific oligonucleotides with 
a label capable of detection. 
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87. The method according to claim 86 wherein said restriction- 
generated oligonucleotides are labeled on the 5' end. 

88. The method according to claim 86 wherein said restriction- 
generated oligonucleotides are labeled on the 3' end. 

89. The method according to claim 86 wherein the label is
radioactive.

90. The method according to claim 86 wherein the label is non-
radioactive.






91. A method for anonymous primer cloning, the method 
comprising the steps of: 

a) digesting an aliquot of template DNA according to claim 21 
thereby generating anonymous DNA fragments; 

b) digesting a plasmid cloning vector with a restriction 
endonuclease thereby creating a cloning site for insertion of said 
anonymous DNA fragments; 

c) ligating the anonymous DNA fragments of step a) into the 
cloning site of step b) thereby creating recombinant plasmids; 

d) transforming competent bacteria with the recombinant
plasmids;

e) selecting transformed colonies;

f) purifying the recombinant plasmids from said transformed
bacteria; 

g) digesting the recombinant plasmid with a restriction 
endonuclease said restriction endonuclease being capable of cutting 
said recombinant plasmid at a site, said site lying within the cloned 
anonymous DNA fragment; 

h) annealing one or more extension primers to the digested 
recombinant plasmid, said extension primers being complementary 
to plasmid sequences flanking the anonymous primer; 

i) extending the extension primer in a template-dependent 
fashion in the presence of one or more nucleotide triphosphates and 
a DNA polymerase; and 

j) denaturing the said hybridized extended primer. 

92. The method according to claim 91 wherein said restriction 
endonuclease reagent comprises CviJ I. 
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93. The method according to claim 91 wherein said restriction 
endonuclease reagent comprises CGase I. 

94. The method according to claim 91 wherein said plasmid 
cloning vector is pFEM. 

95. The method according to claim 94 wherein the restriction 
endonuclease of step b) is Eco RV. 

96. The method according to claim 91 wherein said extension 
primer has a label capable of detection. 

97. A kit for anonymous primer cloning comprising in association: 

a) a restriction endonuclease reagent, according to claims 16 or 
18; 

b) a restriction endonuclease buffer; 

c) a cloning vector; 

d) competent bacteria; 

e) one or more extension primers said extension primers being 
complementary to plasmid sequences flanking said anonymous 
primers; and 

f) a DNA polymerase reagent. 

98. The kit according to claim 97 wherein said restriction 
endonuclease reagent comprises CviJ I.

99. The kit according to claim 98 wherein said restriction 
endonuclease buffer is CviJ I* buffer.






100. The kit according to claim 97 wherein said restriction 
endonuclease reagent is selected from the group consisting of CGase I and Aci I. 

101. The kit according to claim 100 wherein said restriction 
endonuclease buffer is CGase I buffer. 



102. The kit according to claim 97 wherein said cloning vector is 

pFEM.

[Figure: annotated nucleotide sequence with one-letter amino acid translation, labeled lacZ' at the 5' end and Amp R at the 3' end; the sequence shown corresponds to SEQ ID NO: 12.]

[Figure: numbered nucleotide sequence listing, rows beginning at positions 1041 through 5441, 80 bases per row.]
A'AATGAGAA 

J'AAGCGTAG 
50 



AATTGCGGAA 
AAAAACGAAT 
GAAGAGAAAA 
AGTTATAAAT 
ACGTTGTAAG 
AAAGCGGATA 
ATATCTAAAA 
TTAGTATGGC 
CAAGCAATAT 
AACAAAAAGA 
AACATGAACC 
TATAAAAATT 
CGAGAGACTT 
GACATCGTCA 
TACGTATCCC 
TTCAGCTTTG 
TTCCTACAGA 
ATCGTCGCAC 
ATGTCCATCC 
TGTCGTCTAT 
TCTTCATATT 
CATCTCTGAA 
ACATATAAAC 
AAATGGCCAC 
GAAATACACC 
GATATCACAG 
AGAAGCAGGT 
CGATTTTGAG 
GCTTTTAATG 
TACACCACTT 
TCATTGATGA 
GCAGGTGCAG 
TATAAAAACA 
GTCATGACAT 
GTCATACCAC 
GATCACAGCG 
GATCTCGTTG 
ACGATCTTTT 
CTACAGTATT 
GCGTATGAAC 
AAGTATCTGC 
AGAACGACTG 
GGTACGTTCC 
TGTGGTTAAA 
CTAAATTCTT 
ACAGTAGAAC 
AGGGTTTACT 
GTTTCGTCGA 
GTCGAAGGTG 
AGGTGGTTTC 
ATCTACTCCA 
TCATTTTCCC 
AAACTAACGT 
CGAATTC 
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A AAA GAG ACT 
GAGAAAAAAA 
CGCGGAAGAG 
TTGCGTCGGG 
TCAATGTTCG 
ATTCAGACAA 
TAGCAGCGGT 
ATTTCTGGAA 
ACCGGTATCT 
TCGGATTTGA 
CGTTCCATAT 
CGTTTTCTAT 
TAAGATTCTT 
TGTTGACCGC 
GTGTTGAGTA 
CCTAAAAGCG 
CCTTCGAAAC 
AGTTTCTATG 
CAATAAAATC 
ATTGCTAACA 
ATCATATAAC 
TACAAATAAA 
TCATTAGAGA 
ACCATACAAA 
GAACCTTGTT 
ACTACATATT 
ATATTTCTGG 
GCAAACCTTA 
GATGCTCATC 
GTCATAGCAT 
CATCGTGCGG 
TACTGAGTGG 
ATATTGATAT 
CTCATCGAAG 
TGCAGTATGT 
CAACGTCTGC 
CATCTTCCTG 
ATAGTGTATT 
GTAGATCGTT 
ACCACTTCCT 
CGTTTTGTGC 
TTACTTCCAA 
GAAATGAATA 
ACCAAGTTTA 
TCTCTATCAC 
TATTTCAGGT 
CACTGGGTGG 
TGAATACTTC 
AGGTGGTTTC 
GTTTCGTCGA 
GTCGAAGGTG 
TAAAGAACCG 
TGCCAAATAA 
CTATTTAAAA 



TTGATACACT 
TGAAGAGAAA <tw 
GAATTGCGGA 1280 
AAAAAACGAA 1360 
GAGAAAAATT \tldQ 
TCGGACCCGC 1520 
TTACAAGGCA 1600 
AGACATAACC 1680 
AGAACCTCAA 1760 
AAAATATGGC I8«0 
TTACGGTAAG 1920 
TATATCTTAC 2000 
GTAAGAACAG 2080 
TATACATCCA 2160 
GTACTAAAAA 22«0 
TACACCATTA 2320 
CTTAGATTTT 2400 
AAGCAATTAC 2*80 
ATTAGTTCCG 2560 
ATCTCGTGTG 26«0 
CTATCGGTAA 2720 
TCGAGAGCGG 28CX) 
ATCATCCGAT 2880 
CTTGCGAGTA 2960 
TATTAAAACA 30^0 
TGAAGATGAT 3120 
GCAGCTCATC 3200 
AGGAACACCA 3280 ' 
GTATCATAAC 3360 
GTTGTAAGTG 3«tG 
GTGCGTCAAG 3520 
TTTATAATGA 3600 
GTGCCCTTAC 3680 
ATCTAATATA 3760 
CAGGTGGTAA 38^0 
GTGATCGTGA 3920 
TGCATTAGGT 4000 
TGGGTGCAAG 4080 
AATTGAATGC a 160 
TACCATTAGT 42M3 
TTTTTTCCTA 4320 
CAATTTCACC 4400 
GTCCTTTTTC 4480 
GGTTGCATTA 4560 
ATACGGGGTC 4640 
CATCGTTCCA 4720 
GCGGCTCCGG 4800 
TATGGCAGTT 4680 
TGGGAGATGT 4960 
GTCGAAGGTG 50*0 
AGGTGGTTTC 5120 
GTTTCGTTGG 52CC- 
GGTGCGCATT 528C 
TTTACCAGAT 5360 
TTATGTAATA 5<.-C 
   1 CAA GAA TAT CTT GGT TAT TTG GTT CAA TAT GAT TAA AATATTTTGATACACTAA ATG GAT ATA

  68 AGA AGA AAA CGT TTT ACA ATA GAA GGG GCT AAA CGT ATA ATA CTC GAA AAA AAG AGA CTT
      R   R   K   R   F   T   I   E   G   A   K   R   I   I   L   E   K   K   R   L

 128 GAA GAG AAA AAA AGA ATT GCG GAA GAG AAA AAA AGA ATT GCA CTT ATA GAA AAA CAA CGA
      E   E   K   K   R   I   A   E   E   K   K   R   I   A   L   I   E   K   Q   R

 188 ATT GCG GAA GAG AAA AAA AGA ATT GCG GAA GAG AAA AAA CGA TTC GCA CTT GAA GAG AAA
      I   A   E   E   K   K   R   I   A   E   E   K   K   R   F   A   L   E   E   K

 248 AAA CGA ATT GCG GAA GAA AAA AAA CGA ATC GCG GAA GAG AAA AAA CGA ATC GTG GAA GAG
      K   R   I   A   E   E   K   K   R   I   A   E   E   K   K   R   I   V   E   E

 308 AAA AAA AGA CTT GCA CTT ATA GAA AAA CAA CGA ATT GCG GAA GAG AAA ATT GCG TCG GGG
      K   K   R   L   A   L   I   E   K   Q   R   I   A   E   E   K   I   A   S   G

 368 AGA AAA ATT AGA AAG AGG ATC | TCT ACA AAT GCA ACA AAA CAT GAA AGA GAA TTT GTC AAA
      R   K   I   R   K   R   I  |  S   T   N   A   T   K   H   E   R   E   F   V   K

 428 GTT ATA AAT TCA ATG TTC GTC GGA CCC GCT ACT TTT GTA TTC GTA GAT ATA AAA GGT AAT
      V   I   N   S   M   F   V   G   P   A   T   F   V   F   V   D   I   K   G   N

 488 AAA TCC AGA GAA ATC CAC AAC GTT GTA AGA TTC AGA CAA TTA CAA GGC AGT AAA GCG AAA
      K   S   R   E   I   H   N   V   V   R   F   R   Q   L   Q   G   S   K   A   K

 548 TCC CCG ACC GCG TAT GTT GAT AGA GAA TAT AAC AAA CCT AAA GCG GAT ATA GCA GCG GTA
      S   P   T   A   Y   V   D   R   E   Y   N   K   P   K   A   D   I   A   A   V

 608 GAC ATA ACC GGT AAA GAT GTG GCA TGG ATA TCC CAT AAA GCA TCT GAA GGA TAT CAA CAA
      D   I   T   G   K   D   V   A   W   I   S   H   K   A   S   E   G   Y   Q   Q

 668 TAT CTA AAA ATT TCT GGA AAG AAC CTC AAG TTC ACA GGA AAA GAA TTA GAA GAA GTT CTA
      Y   L   K   I   S   G   K   N   L   K   F   T   G   K   E   L   E   E   V   L

 728 TCG TTC AAG AGA AAA GTA GTT AGT ATG GCA CCG GTA TCT AAA ATA TGG CCT GCT AAT AAG
      S   F   K   R   K   V   V   S   M   A   P   V   S   K   I   W   P   A   N   K

 788 ACC GTA TGG TCT CCT ATC AAG TCA AAT TTG ATT AAA AAT CAA GCA ATA TTC GGA TTT GAT
      T   V   W   S   P   I   K   S   N   L   I   K   N   Q   A   I   F   G   F   D

 848 TAC GGT AAG AAA CCA GGA AGG GAC AAT GTA GAC ATC ATA GGT CAA GGA CGA CCA ATT ATA
      Y   G   K   K   P   G   R   D   N   V   D   I   I   G   Q   G   R   P   I   I

 908 ACA AAA AGA GGT TCC ATA TTA TAT CTT ACA TTC ACT GGT TTT AGC GCA TTA AAT GGG CAC
      T   K   R   G   S   I   L   Y   L   T   F   T   G   F   S   A   L   N   G   H

 968 TTG GAG AAT TTT ACT GGG AAA CAT GAA CCC GTT TTC TAT GTA AGA ACA GAA CGG AGT AGT
      L   E   N   F   T   G   K   H   E   P   V   F   Y   V   R   T   E   R   S   S

1028 AGC GGG AGA AGT ATA ACA ACT GTC GTC AAT GGT GTC ACT TAT AAA AAT TTA AGA TTC TTT
      S   G   R   S   I   T   T   V   V   N   G   V   T   Y   K   N   L   R   F   F

1088 ATA CAT CCA TAC AAC TTT GTT TCT TCA AAA ACA CAA CGT ATT ATG TAG CACCATTTTCCCGAG
      I   H   P   Y   N   F   V   S   S   K   T   Q   R   I   M   *

1152 AGACTTTGTTGACCGCGTACTAAAAAATGGTCACGATATTTGTCTAAAGATGCTCATAGAAGCAGGTGCAAACCTTGAC
Primer 2          Cloned Anonymous Primer

' SI^^GACGC ^f^^ 

CATTTTGCTGCCGGTCACTTAAGCGCTTCTAYYYYYYYYYYYYYYYYYTAGT

Mbo II | Primer 1

↓

Mbo II Digest (or Fok I)
Denature DNA
Anneal End Labeled Primer 1 (or Primer 2)

XXXXXXXXXXATCATCCAAGCTTGGCACTGGCCGTCGTTTTAC
                          TGACCGGCAGCAAAATG

↓

DNA Polymerase
dNTPs

XXXXXXXXXXATCATCCAAGCTTGGCACTGGCCGTCGTTTTAC
YYYYYYYYYYTAGTAGGTTCGAACCGTGACCGGCAGCAAAATG

↓

Denature and Separate Primer from Vector
Labeled Anonymous Primer Ready for Cosmid Sequencing
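
The anneal-and-extend steps in the scheme above amount to copying the template strand into its reverse complement, starting at the primer. A minimal Python sketch follows. The function names and the ten-base insert `GATTACAGGC` are hypothetical, the latter standing in for the anonymous X positions; the vector and primer bases are those printed in the figure, where the primer is drawn 3' to 5' beneath the template.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_primer(template: str, primer: str) -> str:
    """Return the extension product (5'->3') of a primer annealed to a template (5'->3')."""
    site = template.find(reverse_complement(primer))
    if site < 0:
        raise ValueError("primer does not anneal to this template")
    # The polymerase copies from the primer's 3' end toward the template's 5' end,
    # so the product is the reverse complement of the template up to the primer site.
    return reverse_complement(template[: site + len(primer)])


VECTOR = "ATCATCCAAGCTTGGCACTGGCCGTCGTTTTAC"  # vector bases from the figure, 5'->3'
PRIMER_1 = "TGACCGGCAGCAAAATG"[::-1]           # figure prints it 3'->5'; reversed to 5'->3'
INSERT = "GATTACAGGC"                          # hypothetical stand-in for XXXXXXXXXX

product = extend_primer(INSERT + VECTOR, PRIMER_1)
print(product)        # primer followed by the copied vector and insert bases
print(product[::-1])  # same strand read 3'->5', the orientation used in the figure
```

Read 3' to 5', the vector part of the product reproduces the lower strand drawn in the figure, and the bases copied from the insert correspond to the Y positions.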
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
This ISA found multiple inventions as follows: 

I. Claims 1-11, drawn to a polynucleotide encoding CviJI, the vector carrying said polynucleotide, the transformed or 
transfected host carrying said vector, and method for producing a CviJI polypeptide from said host, classified in Class 
536, subclass 23.72, for example. 

II. Claims 12 and 13, drawn to the recombinant CviJI polypeptide, classified in Class 530, subclass 350, for example. 

III. Claims 14, 16, 19, 21, and 22, drawn to a method for restriction endonuclease digestion using CviJI, classified in 
Class 435, subclass 6, for example. 

IV. Claims 14, 15, 17, 18, and 20-22, drawn to CGase I restriction endonuclease and a method for using it in 
restriction endonuclease digestion, classified in Class 435, subclass 6, for example. 

V. Claims 23, 24, 26-38, and 41-43, drawn to a method for shotgun cloning after partial digestion using CviJI, 
classified in Class 435, subclass 172.3. 

VI. Claims 23, 25-35, and 39-43, drawn to a method for shotgun cloning after partial digestion using CGase I, 
classified in Class 435, subclass 172.3. 

VII. Claims 44, 45, 47-53, and 55-63, drawn to a method of extension labeling of DNA and thermal cycle labeling 
using CviJI, classified in Class 435, subclass 91.1, for example. 

VIII. Claims 44, 46-52, 54-61, and 64-72, drawn to a method of extension labeling of DNA and thermal cycle labeling 
using CGase I, classified in Class 435, subclass 91.1, for example. 

IX. Claims 73-85, drawn to a universal thermal cycle labeling of DNA, classified in Class 435, subclass 91.1, for example. 

X. Claims 86-90, drawn to a method of end labeling after CviJI digestion, classified in Class 435, subclass 91.53. 

XI. Claims 86-90, drawn to a method of end labeling after CGase I digestion, classified in Class 435, subclass 91.53. 

XII. Claims 91, 92, and 94-99, drawn to a method for anonymous primer cloning after digestion with CviJI, classified 
in Class 435, subclass 172.3, for example. 

XIII. Claims 91, 93, 94-97, and 100-102, drawn to a method for anonymous primer cloning after digestion with CGase 
I, classified in Class 435, subclass 172.3, for example. 

Detailed Reasons for Lack of Unity 

PCT Rule 13 recites the basic principle of unity of invention that an application should relate to only one 
invention or, if there is more than one invention, that applicant would have a right to include in a single application 
only those inventions which are so linked as to form a single general inventive concept. According to Rule 13, a group 
of inventions is linked to form a single inventive concept where there is a technical relationship among the inventions 
that involves at least one common or corresponding special technical feature that defines the contribution which each 
claimed invention, considered as a whole, makes over the prior art. 

The thirteen inventions of this application consist of: 

1) a polynucleotide encoding CviJI, the vector comprising it, the transformed host carrying the vector, and a method of 
making the protein using the vector, 

2) the recombinant peptide CviJI, 

3) a method for restriction endonuclease digestion using CviJI, 

4) CGase I restriction endonuclease and a method for using it in restriction endonuclease digestion, 

5) a method for shotgun cloning after partial digestion using CviJI, 

6) a method for shotgun cloning after partial digestion using CGase I, 

7) a method of extension labeling of DNA and thermal cycle labeling using CviJI, 

8) a method of extension labeling of DNA and thermal cycle labeling using CGase I, 

9) a universal thermal cycle labeling of DNA, 

10) a method of end labeling after CviJI digestion, 

11) a method of end labeling after CGase I digestion, 

12) a method for anonymous primer cloning after digestion with CviJI, and 

13) a method for anonymous primer cloning after digestion with CGase I. 

The thirteen inventions are not linked by a special technical feature within the meaning of PCT Rule 13 for 
the following reasons: Those claims drawn to CviJI are not linked to those claims drawn to CGase I because there 
is no technical relationship among these inventions that involves at least one common or corresponding special technical 
feature. 

The claims that involve the polynucleotide encoding CviJI, the vector containing it, the host carrying the 
vector, and methods of making recombinant protein are not linked to the recombinant protein because the protein and 
polynucleotide share a technical relationship that involves a corresponding technical feature that does not define the 
contribution which each claimed invention, considered as a whole, makes over the prior art because cloning and 
expression of polynucleotides to make recombinant polypeptides are well known in the art. Accordingly, such does not 
constitute a special technical feature within the meaning of PCT Rule 13.2. 

The methods for restriction endonuclease digestion, shotgun cloning and sequencing with CviJI, for extension 
and thermal cycle labeling with CviJI, for universal cycle labeling, for end labeling after CviJI digestion, and for 
anonymous primer cloning after CviJI digestion involve a corresponding technical feature, reaction with CviJI, that 
does not define the contribution which each claimed invention, considered as a whole, makes over the prior art because 
restriction endonuclease digestion, and shotgun cloning and sequencing, extension and thermal cycle labeling, 
universal cycle labeling, end labeling, and anonymous primer cloning after restriction endonuclease digestion are 
known in the art. In addition, CviJI is also known in the art. Accordingly, such does not constitute a special technical 
feature within the meaning of PCT Rule 13.2. 

Similarly, the methods for restriction endonuclease digestion, shotgun cloning and sequencing with CGase I, 
for extension and thermal cycle labeling with CGase I, for universal cycle labeling, for end labeling after CGase I 
digestion, and for anonymous primer cloning after CGase I digestion involve a corresponding technical feature, reaction 
with CGase I, that does not define the contribution which each claimed invention, considered as a whole, makes over the 
prior art because restriction endonuclease digestion, and shotgun cloning and sequencing, extension and thermal cycle 
labeling after restriction endonuclease digestion, universal cycle labeling, end labeling, and anonymous primer cloning 
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